Ultra3D: Efficient and High-Fidelity 3D Generation with Part Attention

July 23, 2025
Authors: Yiwen Chen, Zhihao Li, Yikai Wang, Hu Zhang, Qin Li, Chi Zhang, Guosheng Lin
cs.AI

Abstract

Recent advances in sparse voxel representations have significantly improved the quality of 3D content generation, enabling high-resolution modeling with fine-grained geometry. However, existing frameworks suffer from severe computational inefficiencies due to the quadratic complexity of attention mechanisms in their two-stage diffusion pipelines. In this work, we propose Ultra3D, an efficient 3D generation framework that significantly accelerates sparse voxel modeling without compromising quality. Our method leverages the compact VecSet representation to efficiently generate a coarse object layout in the first stage, reducing token count and accelerating voxel coordinate prediction. To refine per-voxel latent features in the second stage, we introduce Part Attention, a geometry-aware localized attention mechanism that restricts attention computation within semantically consistent part regions. This design preserves structural continuity while avoiding unnecessary global attention, achieving up to 6.7x speed-up in latent generation. To support this mechanism, we construct a scalable part annotation pipeline that converts raw meshes into part-labeled sparse voxels. Extensive experiments demonstrate that Ultra3D supports high-resolution 3D generation at 1024 resolution and achieves state-of-the-art performance in both visual fidelity and user preference.
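To make the Part Attention idea concrete, below is a minimal PyTorch sketch of self-attention restricted to tokens that share a part label, so each part of size n_p costs O(n_p^2) instead of O(N^2) over all N voxel tokens. The class name, projection layout, and shapes are illustrative assumptions for this sketch, not the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class PartAttention(nn.Module):
    """Sketch of part-restricted self-attention over sparse voxel tokens (hypothetical layout)."""

    def __init__(self, dim, num_heads=8):
        super().__init__()
        self.num_heads = num_heads
        self.head_dim = dim // num_heads
        self.qkv = nn.Linear(dim, 3 * dim, bias=False)
        self.proj = nn.Linear(dim, dim)

    def forward(self, x, part_ids):
        # x: (N, dim) per-voxel latent tokens; part_ids: (N,) integer part label per token.
        N, dim = x.shape
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        out = torch.empty_like(x)
        # Attention is computed independently inside each part, so tokens never
        # attend across part boundaries and the cost is sum_p n_p^2 rather than N^2.
        for pid in part_ids.unique():
            idx = (part_ids == pid).nonzero(as_tuple=True)[0]
            # (n_p, dim) -> (heads, n_p, head_dim) so attention runs per head within this part
            qp = q[idx].view(-1, self.num_heads, self.head_dim).transpose(0, 1)
            kp = k[idx].view(-1, self.num_heads, self.head_dim).transpose(0, 1)
            vp = v[idx].view(-1, self.num_heads, self.head_dim).transpose(0, 1)
            attn = F.scaled_dot_product_attention(qp, kp, vp)
            out[idx] = attn.transpose(0, 1).reshape(-1, dim)
        return self.proj(out)

# Example usage with made-up sizes: 5,000 voxel tokens grouped into 12 parts.
x = torch.randn(5000, 512)
part_ids = torch.randint(0, 12, (5000,))
y = PartAttention(dim=512)(x, part_ids)  # (5000, 512)
```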