DriveGen3D: Boosting Feed-Forward Driving Scene Generation with Efficient Video Diffusion
October 17, 2025
Authors: Weijie Wang, Jiagang Zhu, Zeyu Zhang, Xiaofeng Wang, Zheng Zhu, Guosheng Zhao, Chaojun Ni, Haoxiao Wang, Guan Huang, Xinze Chen, Yukun Zhou, Wenkang Qin, Duochao Shi, Haoyun Li, Guanghong Jia, Jiwen Lu
cs.AI
Abstract
We present DriveGen3D, a novel framework for generating high-quality and
highly controllable dynamic 3D driving scenes that addresses critical
limitations in existing methodologies. Current approaches to driving scene
synthesis either suffer from prohibitive computational demands for extended
temporal generation, focus exclusively on prolonged video synthesis without 3D
representation, or restrict themselves to static single-scene reconstruction.
Our work bridges this methodological gap by integrating accelerated long-term
video generation with large-scale dynamic scene reconstruction through
multimodal conditional control. DriveGen3D introduces a unified pipeline
consisting of two specialized components: FastDrive-DiT, an efficient video
diffusion transformer for high-resolution, temporally coherent video synthesis
under text and Bird's-Eye-View (BEV) layout guidance; and FastRecon3D, a
feed-forward reconstruction module that rapidly builds 3D Gaussian
representations across time, ensuring spatial-temporal consistency. Together,
these components enable real-time generation of extended driving videos (up to
424×800 at 12 FPS) and corresponding dynamic 3D scenes, achieving SSIM
of 0.811 and PSNR of 22.84 on novel view synthesis, all while maintaining
parameter efficiency.
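
For readers unfamiliar with the PSNR figure quoted above, the following is a minimal sketch of the standard peak signal-to-noise ratio formula, PSNR = 10 · log10(MAX² / MSE). It is not the authors' evaluation code: the function name `psnr`, the flat-list image representation, and the default 8-bit peak value of 255 are assumptions made for illustration.

```python
import math


def psnr(reference, rendered, max_value=255.0):
    """Return the PSNR in dB between two images given as flat pixel lists.

    Hypothetical helper for illustration; `max_value` is the assumed peak
    pixel value (255 for 8-bit images, 1.0 for normalized floats).
    """
    if len(reference) != len(rendered) or not reference:
        raise ValueError("images must be non-empty and of equal size")
    # Mean squared error over all pixel values.
    mse = sum((r - s) ** 2 for r, s in zip(reference, rendered)) / len(reference)
    if mse == 0:
        # Identical images: PSNR is unbounded.
        return math.inf
    return 10.0 * math.log10(max_value ** 2 / mse)


if __name__ == "__main__":
    ground_truth = [0.0, 1.0, 0.5, 0.25]
    novel_view = [0.1, 0.9, 0.5, 0.25]
    print(f"PSNR: {psnr(ground_truth, novel_view, max_value=1.0):.2f} dB")
```

Inverting the formula gives a rough sense of scale: if pixel values are assumed to lie in the 8-bit range 0 to 255, a PSNR of 22.84 dB corresponds to a root-mean-square error of about 18.4 intensity levels.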