SEEAvatar: Photorealistic Text-to-3D Avatar Generation with Constrained Geometry and Appearance

Abstract

Powered by large-scale text-to-image generation models, text-to-3D avatar generation has made promising progress. However, most methods fail to produce photorealistic results because they are limited by imprecise geometry and low-quality appearance. Towards more practical avatar generation, we present SEEAvatar, a method for generating photorealistic 3D avatars from text with SElf-Evolving constraints for decoupled geometry and appearance. For geometry, we constrain the optimized avatar to a plausible global shape using a template avatar. The template avatar is initialized with a human prior and is periodically updated by the optimized avatar, so it acts as an evolving template that enables more flexible shape generation. In addition, the geometry of local parts such as the face and hands is constrained by the static human prior to preserve their delicate structures. For appearance generation, we use a diffusion model enhanced by prompt engineering to guide a physically based rendering pipeline to generate realistic textures. A lightness constraint is applied to the albedo texture to suppress incorrect lighting effects. Experiments show that our method outperforms previous methods by a large margin in both global and local geometry and in appearance quality. Because our method produces high-quality meshes and textures, these assets can be used directly in classic graphics pipelines for realistic rendering under any lighting condition. Project page: https://seeavatar3d.github.io.
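The idea of a periodically updated template can be illustrated with a toy sketch. This is not the SEEAvatar implementation: the function name, the quadratic losses, the `target` vector (a stand-in for text-driven guidance), and all hyperparameters are assumptions made purely for illustration. It only shows why refreshing the template lets the optimized shape move further from its initialization than a fixed template would allow.

```python
def optimize_with_evolving_template(init, target, steps=300, lr=0.1,
                                    weight=1.0, update_every=50):
    """Toy gradient descent with a periodically refreshed template.

    Hypothetical loss (an assumption, not taken from the paper):
        0.5 * |x - target|^2 + 0.5 * weight * |x - template|^2
    `target` stands in for text guidance; `template` starts at `init`,
    which stands in for the human-prior initialization.
    """
    x = [float(v) for v in init]
    template = list(x)
    for step in range(1, steps + 1):
        # Gradient step: pull toward the target and toward the template.
        x = [xi - lr * ((xi - ti) + weight * (xi - pi))
             for xi, ti, pi in zip(x, target, template)]
        # Every `update_every` steps, the template adopts the current shape.
        # update_every=0 disables updates (static template).
        if update_every and step % update_every == 0:
            template = list(x)
    return x


if __name__ == "__main__":
    init, target = [0.0, 0.0, 0.0], [2.0, 2.0, 2.0]
    static = optimize_with_evolving_template(init, target, update_every=0)
    evolving = optimize_with_evolving_template(init, target, update_every=50)
    print("static template:  ", static)
    print("evolving template:", evolving)
```

In this toy setting, a static template holds the result halfway between the initialization and the target, while the evolving template lets it approach the target over successive update periods.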
