Language Conditioned Traffic Generation

Abstract

Simulation forms the backbone of modern self-driving development. Simulators help develop, test, and improve driving systems without putting humans, vehicles, or their environment at risk. However, simulators face a major challenge: they rely on content that is realistic and scalable, yet interesting. While recent advances in rendering and scene reconstruction have made great strides in creating static scene assets, modeling their layout, dynamics, and behaviors remains challenging. In this work, we turn to language as a source of supervision for dynamic traffic scene generation. Our model, LCTGen, combines a large language model with a transformer-based decoder architecture that selects likely map locations from a dataset of maps and produces an initial traffic distribution as well as the dynamics of each vehicle. LCTGen outperforms prior work in both unconditional and conditional traffic scene generation in terms of realism and fidelity. Code and video will be available at https://ariostgx.github.io/lctgen.
