Towards Scalable and Consistent 3D Editing
October 3, 2025
Authors: Ruihao Xia, Yang Tang, Pan Zhou
cs.AI
Abstract
3D editing - the task of locally modifying the geometry or appearance of a 3D
asset - has wide applications in immersive content creation, digital
entertainment, and AR/VR. However, unlike 2D editing, it remains challenging
due to the need for cross-view consistency, structural fidelity, and
fine-grained controllability. Existing approaches are often slow, prone to
geometric distortions, or dependent on manual and accurate 3D masks that are
error-prone and impractical. To address these challenges, we advance both the
data and model fronts. On the data side, we introduce 3DEditVerse, the largest
paired 3D editing benchmark to date, comprising 116,309 high-quality training
pairs and 1,500 curated test pairs. Built through complementary pipelines of
pose-driven geometric edits and foundation model-guided appearance edits,
3DEditVerse ensures edit locality, multi-view consistency, and semantic
alignment. On the model side, we propose 3DEditFormer, a
3D-structure-preserving conditional transformer. By enhancing image-to-3D
generation with dual-guidance attention and time-adaptive gating, 3DEditFormer
disentangles editable regions from preserved structure, enabling precise and
consistent edits without requiring auxiliary 3D masks. Extensive experiments
demonstrate that our framework outperforms state-of-the-art baselines both
quantitatively and qualitatively, establishing a new standard for practical and
scalable 3D editing. Dataset and code will be released. Project:
https://www.lv-lab.org/3DEditFormer/
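The abstract does not detail how dual-guidance attention and time-adaptive gating are wired together. The sketch below is one plausible reading, not the paper's implementation: a query attends separately to edit-condition features and to source-structure features, and a sigmoid gate over the diffusion timestep `t` blends the two streams (the gate schedule, the `alpha` sharpness parameter, and all weight names here are assumptions).

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(q, k, v):
    # Standard scaled dot-product cross-attention.
    d = q.shape[-1]
    return softmax(q @ k.T / np.sqrt(d)) @ v

def dual_guidance_attention(x, edit_feats, struct_feats, t,
                            Wq, Wk_e, Wv_e, Wk_s, Wv_s, alpha=8.0):
    """Hypothetical dual-guidance block: blend attention over
    edit-condition features and source-structure features with a
    time-adaptive gate. Assumed schedule: high-noise steps (t near 1)
    weight structure preservation; low-noise steps weight the edit.
    """
    q = x @ Wq
    a_edit = cross_attention(q, edit_feats @ Wk_e, edit_feats @ Wv_e)
    a_struct = cross_attention(q, struct_feats @ Wk_s, struct_feats @ Wv_s)
    g = 1.0 / (1.0 + np.exp(-alpha * (0.5 - t)))  # gate in (0, 1)
    return g * a_edit + (1.0 - g) * a_struct

# Toy usage with random features (shapes only; no trained weights).
rng = np.random.default_rng(0)
d = 4
x = rng.standard_normal((3, d))            # 3 latent tokens
edit_feats = rng.standard_normal((5, d))   # edit-condition tokens
struct_feats = rng.standard_normal((6, d)) # source-structure tokens
Ws = [rng.standard_normal((d, d)) * 0.1 for _ in range(5)]
out = dual_guidance_attention(x, edit_feats, struct_feats, t=0.5, *Ws) \
    if False else dual_guidance_attention(x, edit_feats, struct_feats, 0.5, *Ws)
print(out.shape)  # (3, 4)
```

The mask-free behavior claimed in the abstract would then emerge from the gate and the structure stream jointly: regions strongly supported by `struct_feats` stay anchored to the source asset, so no auxiliary 3D mask is needed to protect them.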