Repurposing 3D Generative Models for Autoregressive Layout Generation

Abstract

We introduce LaviGen, a framework that repurposes 3D generative models for 3D layout generation. Unlike previous methods that infer object layouts from textual descriptions, LaviGen operates directly in native 3D space and formulates layout generation as an autoregressive process that explicitly models geometric relations and physical constraints among objects, producing coherent and physically plausible 3D scenes. To further enhance this process, we propose an adapted 3D diffusion model that integrates scene, object, and instruction information and employs a dual-guidance self-rollout distillation mechanism to improve efficiency and spatial accuracy. Extensive experiments on the LayoutVLM benchmark show that LaviGen achieves superior 3D layout generation performance, with 19% higher physical plausibility than the state of the art and 65% faster computation. Our code is publicly available at https://github.com/fenghora/LaviGen.
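
To make the autoregressive formulation concrete, the following toy sketch places objects one at a time, with each placement conditioned on the objects already in the scene and rejected if it violates a simple physical constraint (footprint overlap). It is an illustration under strong assumptions and not the LaviGen implementation: the learned 3D diffusion model is replaced by uniform random proposals, the scene is reduced to 2D axis-aligned footprints, and every name here (`Box`, `propose`, `generate_layout`) is hypothetical.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned floor footprint of a placed object (hypothetical type)."""
    name: str
    x: float  # min corner along the room width
    y: float  # min corner along the room depth
    w: float  # footprint width
    d: float  # footprint depth


def overlaps(a: Box, b: Box) -> bool:
    """True when two footprints intersect with positive area (touching is allowed)."""
    return (a.x < b.x + b.w and b.x < a.x + a.w and
            a.y < b.y + b.d and b.y < a.y + a.d)


def propose(name, w, d, room_w, room_d, rng):
    """Stand-in for a learned sampler: a uniform position that fits in the room."""
    return Box(name, rng.uniform(0, room_w - w), rng.uniform(0, room_d - d), w, d)


def generate_layout(objects, room_w, room_d, seed=0, max_tries=1000):
    """Place objects sequentially; each step sees everything placed so far."""
    rng = random.Random(seed)
    placed = []
    for name, w, d in objects:
        for _ in range(max_tries):
            cand = propose(name, w, d, room_w, room_d, rng)
            # Physical constraint: reject candidates that collide with the scene.
            if not any(overlaps(cand, p) for p in placed):
                placed.append(cand)
                break
        else:
            raise RuntimeError(f"could not place {name!r} in {max_tries} tries")
    return placed


if __name__ == "__main__":
    furniture = [("bed", 2.0, 1.6), ("desk", 1.2, 0.6), ("chair", 0.5, 0.5)]
    for box in generate_layout(furniture, room_w=4.0, room_d=4.0):
        print(box)
```

Rejection sampling is used here only because it is the simplest way to show the dependence of each step on the partial scene; a learned model would instead generate placements conditioned on the scene, the object, and the instruction.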