

Towards Scalable and Consistent 3D Editing

October 3, 2025
Authors: Ruihao Xia, Yang Tang, Pan Zhou
cs.AI

Abstract

3D editing, the task of locally modifying the geometry or appearance of a 3D asset, has wide applications in immersive content creation, digital entertainment, and AR/VR. Unlike 2D editing, however, it remains challenging because it demands cross-view consistency, structural fidelity, and fine-grained controllability. Existing approaches are often slow, prone to geometric distortions, or dependent on manually crafted, precise 3D masks that are error-prone and impractical. To address these challenges, we advance both the data and the model fronts. On the data side, we introduce 3DEditVerse, the largest paired 3D editing benchmark to date, comprising 116,309 high-quality training pairs and 1,500 curated test pairs. Built through complementary pipelines of pose-driven geometric edits and foundation-model-guided appearance edits, 3DEditVerse ensures edit locality, multi-view consistency, and semantic alignment. On the model side, we propose 3DEditFormer, a 3D-structure-preserving conditional transformer. By enhancing image-to-3D generation with dual-guidance attention and time-adaptive gating, 3DEditFormer disentangles editable regions from preserved structure, enabling precise and consistent edits without requiring auxiliary 3D masks. Extensive experiments demonstrate that our framework outperforms state-of-the-art baselines both quantitatively and qualitatively, establishing a new standard for practical and scalable 3D editing. The dataset and code will be released. Project: https://www.lv-lab.org/3DEditFormer/
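The abstract names two architectural ingredients, dual-guidance attention and time-adaptive gating, without further detail. As a rough illustration only, the PyTorch sketch below shows one plausible way such a block could blend structure-preserving features with edit-guidance features under a timestep-dependent gate; every name in it (`DualGuidanceAttention`, `struct_feats`, `edit_feats`, `time_emb`) is a hypothetical stand-in, not the authors' implementation.

```python
# Speculative sketch of "dual-guidance attention with time-adaptive
# gating" as described in the abstract. Module and argument names are
# hypothetical illustrations, not code from the 3DEditFormer paper.
import torch
import torch.nn as nn

class DualGuidanceAttention(nn.Module):
    def __init__(self, dim: int, num_heads: int = 8):
        super().__init__()
        # Two cross-attention branches: one attends to features of the
        # structure to preserve, the other to features of the requested edit.
        self.attn_struct = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.attn_edit = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        # Time-adaptive gate: maps a diffusion-timestep embedding to a
        # per-channel blending weight in (0, 1).
        self.gate = nn.Sequential(nn.Linear(dim, dim), nn.Sigmoid())

    def forward(self, x, struct_feats, edit_feats, time_emb):
        # x:            (B, N, dim) latent tokens of the 3D asset
        # struct_feats: (B, M, dim) structure-preservation guidance tokens
        # edit_feats:   (B, K, dim) edit guidance tokens
        # time_emb:     (B, dim)    embedding of the diffusion timestep
        h_struct, _ = self.attn_struct(x, struct_feats, struct_feats)
        h_edit, _ = self.attn_edit(x, edit_feats, edit_feats)
        # Gate the two branches per timestep, so noisy early steps can
        # weight edit guidance differently from late refinement steps.
        g = self.gate(time_emb).unsqueeze(1)  # (B, 1, dim)
        return x + g * h_struct + (1.0 - g) * h_edit
```

One design point this sketch tries to capture: routing the two guidance sources through separate attention branches, rather than concatenating them, is what would let a learned gate disentangle "what to keep" from "what to change" at each denoising step.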