LRM-Zero: Training Large Reconstruction Models with Synthesized Data
June 13, 2024
Authors: Desai Xie, Sai Bi, Zhixin Shu, Kai Zhang, Zexiang Xu, Yi Zhou, Sören Pirk, Arie Kaufman, Xin Sun, Hao Tan
cs.AI
Abstract
We present LRM-Zero, a Large Reconstruction Model (LRM) trained entirely on
synthesized 3D data, achieving high-quality sparse-view 3D reconstruction. The
core of LRM-Zero is our procedural 3D dataset, Zeroverse, which is
automatically synthesized from simple primitive shapes with random texturing
and augmentations (e.g., height fields, boolean differences, and wireframes).
Unlike previous 3D datasets (e.g., Objaverse) which are often captured or
crafted by humans to approximate real 3D data, Zeroverse completely ignores
realistic global semantics but is rich in complex geometric and texture details
that are locally similar to or even more intricate than real objects. We
demonstrate that our LRM-Zero, trained with our fully synthesized Zeroverse,
can achieve high visual quality in the reconstruction of real-world objects,
competitive with models trained on Objaverse. We also analyze several critical
design choices of Zeroverse that contribute to LRM-Zero's capability and
training stability. Our work demonstrates that 3D reconstruction, one of the
core tasks in 3D vision, can potentially be addressed without the semantics of
real-world objects. Zeroverse's procedural synthesis code and interactive
visualization are available at: https://desaixie.github.io/lrm-zero/.
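The abstract describes Zeroverse objects as compositions of simple primitives with random texturing and stochastic augmentations (height fields, boolean differences, wireframes). The sketch below illustrates that style of procedural sampling in plain Python. It is a minimal illustration, not the actual Zeroverse pipeline: the primitive list, parameter ranges, augmentation probability, and all function names (`sample_shape`, `sample_object`) are assumptions for exposition only.

```python
import random

# Illustrative primitive and augmentation vocabularies (assumed, not from the paper's code).
PRIMITIVES = ["cube", "sphere", "cylinder", "cone", "torus"]
AUGMENTATIONS = ["height_field", "boolean_difference", "wireframe"]

def sample_shape(rng: random.Random) -> dict:
    """Sample one primitive with a random pose, scale, and texture seed."""
    return {
        "primitive": rng.choice(PRIMITIVES),
        "scale": [rng.uniform(0.2, 1.0) for _ in range(3)],
        "rotation_deg": [rng.uniform(0.0, 360.0) for _ in range(3)],
        "translation": [rng.uniform(-0.5, 0.5) for _ in range(3)],
        "texture_seed": rng.randrange(2**31),
    }

def sample_object(rng: random.Random, min_parts: int = 3, max_parts: int = 9) -> dict:
    """Compose several random primitives and independently toggle augmentations."""
    n_parts = rng.randint(min_parts, max_parts)
    parts = [sample_shape(rng) for _ in range(n_parts)]
    # Each augmentation is applied with an assumed 30% probability.
    augs = [a for a in AUGMENTATIONS if rng.random() < 0.3]
    return {"parts": parts, "augmentations": augs}

if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed for reproducible sampling
    obj = sample_object(rng)
    print(len(obj["parts"]), obj["augmentations"])
```

In a real pipeline, each sampled specification would then be realized as a textured mesh (e.g., via a geometry library) and rendered from multiple viewpoints to form the sparse-view training pairs; the key point the abstract makes is that no semantic or human-authored asset is needed at any step.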