
CoIRL-AD: Collaborative-Competitive Imitation-Reinforcement Learning in Latent World Models for Autonomous Driving

October 14, 2025
Authors: Xiaoji Zheng, Ziyuan Yang, Yanhao Chen, Yuhang Peng, Yuanrong Tang, Gengyuan Liu, Bokui Chen, Jiangtao Gong
cs.AI

Abstract

End-to-end autonomous driving models trained solely with imitation learning (IL) often suffer from poor generalization. In contrast, reinforcement learning (RL) promotes exploration through reward maximization but faces challenges such as sample inefficiency and unstable convergence. A natural solution is to combine IL and RL. Moving beyond the conventional two-stage paradigm (IL pretraining followed by RL fine-tuning), we propose CoIRL-AD, a competitive dual-policy framework that enables IL and RL agents to interact during training. CoIRL-AD introduces a competition-based mechanism that facilitates knowledge exchange while preventing gradient conflicts. Experiments on the nuScenes dataset show an 18% reduction in collision rate compared to baselines, along with stronger generalization and improved performance on long-tail scenarios. Code is available at: https://github.com/SEU-zxj/CoIRL-AD.
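The core idea above, two separate policies trained with their own objectives and linked only by a competition-based exchange, can be illustrated with a toy sketch. This is a minimal, hypothetical 1-D reconstruction, not the paper's actual algorithm: the scoring rule, `distill_rate`, and the zeroth-order RL update are all illustrative assumptions. It shows how the IL and RL agents keep independent gradients (avoiding gradient conflicts) while the weaker agent is periodically pulled toward the stronger one (knowledge exchange).

```python
import random

# Toy setting: the expert maps observation o -> EXPERT_GAIN * o,
# and each "policy" is a single scalar gain parameter.
EXPERT_GAIN = 2.0

def score(param, obs_batch):
    # Higher is better: negative mean squared error vs. expert actions.
    # (Stands in for the paper's competition criterion, which we don't reproduce.)
    return -sum((param * o - EXPERT_GAIN * o) ** 2 for o in obs_batch) / len(obs_batch)

def train(steps=200, lr=0.05, distill_rate=0.1, seed=0):
    rng = random.Random(seed)
    il_param, rl_param = 0.0, 5.0  # two separate policies, no shared gradients

    for _ in range(steps):
        batch = [rng.uniform(-1.0, 1.0) for _ in range(16)]

        # IL agent: supervised (behavior-cloning) gradient toward expert actions.
        grad_il = sum(2 * (il_param * o - EXPERT_GAIN * o) * o for o in batch) / len(batch)
        il_param -= lr * grad_il

        # RL agent: crude zeroth-order update on the reward signal
        # (a stand-in for policy-gradient RL in the latent world model).
        eps = 0.1
        if score(rl_param + eps, batch) > score(rl_param - eps, batch):
            rl_param += lr
        else:
            rl_param -= lr

        # Competition: the weaker policy distills toward the stronger one,
        # so knowledge flows between agents without mixing their gradients.
        if score(il_param, batch) >= score(rl_param, batch):
            rl_param += distill_rate * (il_param - rl_param)
        else:
            il_param += distill_rate * (rl_param - il_param)

    return il_param, rl_param

il_p, rl_p = train()
print(il_p, rl_p)  # both gains should approach the expert gain of 2.0
```

Note the design point the sketch isolates: each agent's parameter is only ever updated by its own loss plus the distillation pull, which is how a dual-policy scheme sidesteps the gradient conflicts that arise when IL and RL losses are summed on one shared policy.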