Language Conditioned Traffic Generation

Abstract

Simulation forms the backbone of modern self-driving development. Simulators help develop, test, and improve driving systems without putting humans, vehicles, or their environment at risk. However, simulators face a major challenge: they rely on realistic, scalable, yet interesting content. While recent advances in rendering and scene reconstruction have made great strides in creating static scene assets, modeling their layout, dynamics, and behaviors remains challenging. In this work, we turn to language as a source of supervision for dynamic traffic scene generation. Our model, LCTGen, combines a large language model with a transformer-based decoder architecture that selects likely map locations from a dataset of maps and produces an initial traffic distribution, as well as the dynamics of each vehicle. LCTGen outperforms prior work in both unconditional and conditional traffic scene generation in terms of realism and fidelity. Code and video will be available at https://ariostgx.github.io/lctgen.
