Diverse Controllable Diffusion Policy with Signal Temporal Logic
March 4, 2025
Authors: Yue Meng, Chuchu Fan
cs.AI
Abstract
Generating realistic simulations is critical for autonomous system
applications such as self-driving and human-robot interactions. However,
driving simulators nowadays still have difficulty in generating controllable,
diverse, and rule-compliant behaviors for road participants: Rule-based models
cannot produce diverse behaviors and require careful tuning, whereas
learning-based methods imitate the policy from data but are not designed to
follow the rules explicitly. Moreover, real-world datasets are by nature
"single-outcome", making it hard for learning-based methods to generate diverse
behaviors. In this paper, we leverage Signal Temporal Logic (STL) and Diffusion
Models to learn a controllable, diverse, and rule-aware policy. We first
calibrate the STL on the real-world data, then generate diverse synthetic data
using trajectory optimization, and finally learn the rectified diffusion policy
on the augmented dataset. We test on the NuScenes dataset, where our approach
generates the most diverse rule-compliant trajectories among the baselines,
with a runtime only 1/17 that of the second-best approach. In closed-loop
testing, our approach achieves the highest diversity, the highest rule
satisfaction rate, and the lowest collision rate. Our method can generate varied characteristics
conditional on different STL parameters in testing. A case study on human-robot
encounter scenarios shows our approach can generate diverse and
close-to-oracle trajectories. The annotation tool, augmented dataset, and code
are available at https://github.com/mengyuest/pSTL-diffusion-policy.
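To make the "rule-compliant" notion concrete, the following minimal sketch computes STL robustness for a hypothetical driving rule G(speed <= v_max) ("always stay under the speed limit"): the robustness is the worst-case margin over the trajectory, positive when the rule holds. The rule, function name, and values are illustrative assumptions, not the paper's implementation.

```python
def robustness_always_le(speeds, v_max):
    """Robustness of the STL formula G(speed <= v_max).

    Returns the minimum margin (v_max - speed) over the trajectory:
    positive => rule satisfied, with magnitude indicating by how much;
    negative => rule violated.
    """
    return min(v_max - s for s in speeds)

# A compliant trajectory has positive robustness, a violating one negative.
print(robustness_always_le([8.0, 9.5, 7.2], v_max=10.0))   # 0.5
print(robustness_always_le([8.0, 11.0, 7.2], v_max=10.0))  # -1.0
```

Calibrating STL parameters (here, v_max) against real data and then optimizing trajectories to maximize this robustness is the general pattern the abstract describes.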