Turbo3D: Ultra-fast Text-to-3D Generation

Abstract

We present Turbo3D, an ultra-fast text-to-3D system capable of generating high-quality Gaussian splatting assets in under one second. Turbo3D employs a rapid 4-step, 4-view diffusion generator and an efficient feed-forward Gaussian reconstructor, both operating in latent space. The 4-step, 4-view generator is a student model distilled through a novel Dual-Teacher approach, which encourages the student to learn view consistency from a multi-view teacher and photo-realism from a single-view teacher. By shifting the Gaussian reconstructor's inputs from pixel space to latent space, we eliminate the extra image decoding time and halve the transformer sequence length for maximum efficiency. Our method demonstrates superior 3D generation results compared to previous baselines, while operating in a fraction of their runtime.
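
The two-stage dataflow described above can be illustrated with a toy sketch. It is not the authors' implementation: only the step count and view count come from the abstract, while the function names, the latent shape, the number of parameters per Gaussian, and the placeholder arithmetic standing in for the networks are all assumptions made for illustration.

```python
import numpy as np

NUM_VIEWS = 4   # from the abstract: 4-view generator
NUM_STEPS = 4   # from the abstract: 4-step sampling
LATENT_SHAPE = (4, 32, 32)  # assumed (channels, height, width); not from the source


def denoise_step(latents: np.ndarray, step: int) -> np.ndarray:
    """Hypothetical placeholder for one pass of the distilled student network."""
    # A real model would predict a cleaner latent; here we only shrink the noise.
    return 0.5 * latents


def generate_multiview_latents(seed: int = 0) -> np.ndarray:
    """Run NUM_STEPS denoising steps jointly over NUM_VIEWS latent views."""
    rng = np.random.default_rng(seed)
    latents = rng.standard_normal((NUM_VIEWS, *LATENT_SHAPE))
    for step in range(NUM_STEPS):
        latents = denoise_step(latents, step)
    return latents


def reconstruct_gaussians(latents: np.ndarray, params_per_gaussian: int = 14) -> np.ndarray:
    """Hypothetical placeholder reconstructor that consumes latents directly."""
    views, _, height, width = latents.shape
    # Assumed: one Gaussian per latent location per view, zero-filled parameters.
    return np.zeros((views * height * width, params_per_gaussian))


latents = generate_multiview_latents()
gaussians = reconstruct_gaussians(latents)  # no image decoding in between
print(latents.shape, gaussians.shape)
```

The point of the sketch is the hand-off: the reconstructor receives the generator's latents as they are, so no decoder runs between the two stages.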
