Vista3D: Unravel the 3D Darkside of a Single Image

Abstract

We embark on an age-old quest: unveiling the hidden dimensions of objects from mere glimpses of their visible parts. To address this, we present Vista3D, a framework that achieves swift and consistent 3D generation within 5 minutes. At the heart of Vista3D lies a two-phase approach: a coarse phase and a fine phase. In the coarse phase, we rapidly generate initial geometry with Gaussian Splatting from a single image. In the fine phase, we extract a Signed Distance Function (SDF) directly from the learned Gaussian Splatting and optimize it with a differentiable isosurface representation. Furthermore, Vista3D improves generation quality by using a disentangled representation with two independent implicit functions to capture both the visible and obscured aspects of objects. Additionally, it harmonizes gradients from a 2D diffusion prior with 3D-aware diffusion priors through angular diffusion prior composition. Through extensive evaluation, we demonstrate that Vista3D effectively sustains a balance between the consistency and diversity of the generated 3D objects. Demos and code will be available at https://github.com/florinshen/Vista3D.
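
To make the term "Signed Distance Function" concrete, here is a minimal Python sketch of an SDF for a sphere. It is an illustration of the general concept only, not Vista3D's implementation: the names sphere_sdf and classify, the unit sphere, and the tolerance eps are all assumptions chosen for this example. An SDF returns a negative value inside the object, zero on its surface, and a positive value outside, so the surface is the zero-level isosurface of the function.

```python
import math


def sphere_sdf(point, center=(0.0, 0.0, 0.0), radius=1.0):
    """Signed distance from `point` to a sphere (hypothetical example shape).

    Negative inside the sphere, zero on its surface, positive outside.
    """
    return math.dist(point, center) - radius


def classify(point, eps=1e-9):
    """Label a point relative to the zero-level isosurface of the SDF."""
    d = sphere_sdf(point)
    if abs(d) <= eps:
        return "surface"
    return "inside" if d < 0 else "outside"


if __name__ == "__main__":
    for p in [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]:
        print(p, sphere_sdf(p), classify(p))
```

In a learned setting the closed-form sphere would be replaced by a function whose parameters are optimized, and a differentiable isosurface extraction step would turn its zero-level set into a mesh; those parts are outside the scope of this sketch.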
