SceneWeaver: All-in-One 3D Scene Synthesis with an Extensible and Self-Reflective Agent
September 24, 2025
Authors: Yandan Yang, Baoxiong Jia, Shujie Zhang, Siyuan Huang
cs.AI
Abstract
Indoor scene synthesis has become increasingly important with the rise of
Embodied AI, which requires 3D environments that are not only visually
realistic but also physically plausible and functionally diverse. While recent
approaches have advanced visual fidelity, they often remain constrained to
fixed scene categories, lack sufficient object-level detail and physical
consistency, and struggle to align with complex user instructions. In this
work, we present SceneWeaver, a reflective agentic framework that unifies
diverse scene synthesis paradigms through tool-based iterative refinement. At
its core, SceneWeaver employs a language model-based planner to select from a
suite of extensible scene generation tools, ranging from data-driven generative
models to visual- and LLM-based methods, guided by self-evaluation of physical
plausibility, visual realism, and semantic alignment with user input. This
closed-loop reason-act-reflect design enables the agent to identify semantic
inconsistencies, invoke targeted tools, and update the environment over
successive iterations. Extensive experiments on both common and open-vocabulary
room types demonstrate that SceneWeaver not only outperforms prior methods on
physical, visual, and semantic metrics, but also generalizes effectively to
complex scenes with diverse instructions, marking a step toward general-purpose
3D environment generation. Project website: https://scene-weaver.github.io/.
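
The closed-loop reason-act-reflect design described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the `Scene` class, the two tools, the scoring rules in `reflect`, and the rule-based `plan` function (standing in for the language model-based planner) are all hypothetical names and simplifications chosen for illustration. The sketch only shows the control flow: score the scene, pick a tool that targets the weakest criterion, apply it, and repeat.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class Scene:
    """Hypothetical scene state: object names plus a collision count."""
    objects: List[str] = field(default_factory=list)
    collisions: int = 0


def add_furniture(scene: Scene) -> Scene:
    # Hypothetical generation tool: adds one object and, in this toy
    # model, always introduces one collision that must be fixed later.
    new_objects = scene.objects + [f"item_{len(scene.objects)}"]
    return Scene(new_objects, scene.collisions + 1)


def resolve_collisions(scene: Scene) -> Scene:
    # Hypothetical refinement tool: removes all collisions.
    return Scene(list(scene.objects), 0)


# Extensible tool registry: new tools can be added without changing the loop.
TOOLS: Dict[str, Callable[[Scene], Scene]] = {
    "add_furniture": add_furniture,
    "resolve_collisions": resolve_collisions,
}

# Which tool addresses which criterion (an assumption of this sketch).
TOOL_FOR_CRITERION = {
    "physical": "resolve_collisions",
    "semantic": "add_furniture",
}


def reflect(scene: Scene, target_count: int) -> Dict[str, float]:
    """Self-evaluation step: toy scores in [0, 1] per criterion."""
    return {
        "physical": 1.0 if scene.collisions == 0 else 0.0,
        "semantic": min(1.0, len(scene.objects) / target_count),
    }


def plan(scores: Dict[str, float]) -> str:
    """Reasoning step: a rule-based stand-in for an LLM planner.
    It picks the tool that targets the lowest-scoring criterion."""
    weakest = min(scores, key=scores.get)
    return TOOL_FOR_CRITERION[weakest]


def synthesize(target_count: int, max_iters: int = 10) -> Tuple[Scene, List[str]]:
    scene = Scene()
    history: List[str] = []
    for _ in range(max_iters):
        scores = reflect(scene, target_count)      # reflect
        if all(s >= 1.0 for s in scores.values()):
            break                                  # every criterion satisfied
        tool_name = plan(scores)                   # reason
        scene = TOOLS[tool_name](scene)            # act
        history.append(tool_name)
    return scene, history


if __name__ == "__main__":
    final_scene, steps = synthesize(target_count=3)
    print(final_scene)
    print(steps)
```

In this toy setup the loop alternates between adding an object and repairing the collision it caused, and it stops once both scores reach 1.0. A real system would replace `plan` with a language model call and `reflect` with physical, visual, and semantic evaluators; those components are outside the scope of this sketch.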