PhysForge: Generating Physics-Grounded 3D Assets for Interactive Virtual World
May 6, 2026
Authors: Yunhan Yang, Chunshi Wang, Junliang Ye, Yang Li, Zanxin Chen, Zehuan Huang, Yao Mu, Zhuo Chen, Chunchao Guo, Xihui Liu
cs.AI
Abstract
Synthesizing physics-grounded 3D assets is a critical bottleneck for interactive virtual worlds and embodied AI. Existing methods predominantly focus on static geometry, overlooking the functional properties essential for interaction. We argue that interactive asset generation must be rooted in functional logic and hierarchical physics. To bridge this gap, we introduce PhysForge, a decoupled two-stage framework supported by PhysDB, a large-scale dataset of 150,000 assets with four-tier physical annotations. First, a vision-language model (VLM) acts as a "physical architect" that plans a "Hierarchical Physical Blueprint" defining material, functional, and kinematic constraints. Second, a physics-grounded diffusion model realizes this blueprint by synthesizing high-fidelity geometry alongside precise kinematic parameters via a novel KineVoxel Injection (KVI) mechanism. Experiments demonstrate that PhysForge produces functionally plausible, simulation-ready assets, providing a robust data engine for interactive 3D content and embodied agents.
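To make the idea of a blueprint with material, functional, and kinematic constraints more concrete, the following is a minimal Python sketch of what such a structure might look like. It is an illustration only, not the PhysForge format: the abstract does not specify a schema, so every class name, field name, and validation rule here (`Joint`, `Part`, `Blueprint`, `validate`) is a hypothetical assumption.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

Vec3 = Tuple[float, float, float]


@dataclass
class Joint:
    """Hypothetical kinematic constraint linking a part to its parent."""
    kind: str                    # assumed vocabulary: "revolute", "prismatic", "fixed"
    axis: Vec3                   # direction of rotation or translation
    origin: Vec3                 # pivot or anchor point in asset coordinates
    limits: Tuple[float, float]  # (lower, upper) range of motion


@dataclass
class Part:
    """Hypothetical per-part entry: material, function, and optional joint."""
    name: str
    material: str
    function: str
    parent: Optional[str] = None
    joint: Optional[Joint] = None


@dataclass
class Blueprint:
    """Hypothetical asset-level container for a part hierarchy."""
    asset: str
    parts: List[Part] = field(default_factory=list)

    def validate(self) -> List[str]:
        """Return a list of human-readable consistency problems."""
        errors: List[str] = []
        names = {p.name for p in self.parts}
        for p in self.parts:
            # A child part must point at a part that exists in the blueprint.
            if p.parent is not None and p.parent not in names:
                errors.append(f"{p.name}: unknown parent '{p.parent}'")
            if p.joint is not None:
                # A joint only makes sense relative to a parent part.
                if p.parent is None:
                    errors.append(f"{p.name}: joint defined without a parent")
                # A zero axis gives no direction of motion.
                if all(a == 0 for a in p.joint.axis):
                    errors.append(f"{p.name}: joint axis is zero")
                lower, upper = p.joint.limits
                if lower > upper:
                    errors.append(f"{p.name}: joint limits are reversed")
        return errors


# Example: a cabinet whose door swings about a vertical hinge.
cabinet = Blueprint(
    asset="cabinet",
    parts=[
        Part(name="body", material="wood", function="storage enclosure"),
        Part(
            name="door",
            material="wood",
            function="opens to give access",
            parent="body",
            joint=Joint(
                kind="revolute",
                axis=(0.0, 0.0, 1.0),
                origin=(0.4, 0.0, 0.0),
                limits=(0.0, 1.57),
            ),
        ),
    ],
)

problems = cabinet.validate()
```

In this sketch, a consistency check of this kind would sit between the planning stage and the generation stage, so that a malformed plan (for example, a joint that references a missing parent) is caught before geometry is synthesized. Whether PhysForge performs such a check is not stated in the abstract.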