GRM: Large Gaussian Reconstruction Model for Efficient 3D Reconstruction and Generation
March 21, 2024
Authors: Yinghao Xu, Zifan Shi, Wang Yifan, Hansheng Chen, Ceyuan Yang, Sida Peng, Yujun Shen, Gordon Wetzstein
cs.AI
Abstract
We introduce GRM, a large-scale reconstructor capable of recovering a 3D
asset from sparse-view images in around 0.1s. GRM is a feed-forward
transformer-based model that efficiently incorporates multi-view information to
translate the input pixels into pixel-aligned Gaussians, which are unprojected
to create a set of densely distributed 3D Gaussians representing a scene.
Together, our transformer architecture and the use of 3D Gaussians unlock a
scalable and efficient reconstruction framework. Extensive experimental results
demonstrate the superiority of our method over alternatives regarding both
reconstruction quality and efficiency. We also showcase the potential of GRM in
generative tasks, i.e., text-to-3D and image-to-3D, by integrating it with
existing multi-view diffusion models. Our project website is at:
https://justimyhxu.github.io/projects/grm/.
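The abstract describes unprojecting pixel-aligned Gaussians into a set of 3D Gaussians. A minimal sketch of that unprojection step is shown below, assuming the model predicts a per-pixel depth for each Gaussian center and that standard pinhole camera intrinsics and extrinsics are available; the function name and interface are hypothetical, not the paper's actual implementation.

```python
import numpy as np

def unproject_pixel_gaussians(depth, K, cam_to_world):
    """Lift per-pixel Gaussian centers into world space (illustrative sketch).

    depth:        (H, W) predicted per-pixel depth along the camera ray
    K:            (3, 3) pinhole camera intrinsics
    cam_to_world: (4, 4) camera-to-world extrinsic matrix
    Returns:      (H*W, 3) world-space 3D Gaussian centers
    """
    H, W = depth.shape
    # Homogeneous pixel coordinates (u, v, 1) for every pixel.
    u, v = np.meshgrid(np.arange(W), np.arange(H))
    pix = np.stack([u, v, np.ones_like(u)], axis=-1).reshape(-1, 3).astype(np.float64)
    # Back-project pixels to camera-space rays, then scale by depth.
    rays = pix @ np.linalg.inv(K).T
    pts_cam = rays * depth.reshape(-1, 1)
    # Transform camera-space points into world space.
    pts_hom = np.concatenate([pts_cam, np.ones((H * W, 1))], axis=1)
    return (pts_hom @ cam_to_world.T)[:, :3]
```

In the full pipeline, each center would additionally carry the Gaussian attributes (opacity, scale, rotation, color) predicted by the transformer; merging the unprojected Gaussians from all input views yields the dense scene representation the abstract refers to.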