MeshAnything: Artist-Created Mesh Generation with Autoregressive Transformers

Abstract

Recently, 3D assets created via reconstruction and generation have matched the quality of manually crafted assets, highlighting their potential to replace them. However, this potential is largely unrealized because these assets must always be converted to meshes for use in 3D industry applications, and the meshes produced by current mesh extraction methods are significantly inferior to Artist-Created Meshes (AMs), i.e., meshes created by human artists. Specifically, current mesh extraction methods rely on dense faces and ignore geometric features, leading to inefficiency, complicated post-processing, and lower representation quality. To address these issues, we introduce MeshAnything, a model that treats mesh extraction as a generation problem, producing AMs aligned with specified shapes. By converting 3D assets in any 3D representation into AMs, MeshAnything can be integrated with various 3D asset production methods, thereby enhancing their application across the 3D industry. The architecture of MeshAnything comprises a VQ-VAE and a shape-conditioned decoder-only transformer. We first learn a mesh vocabulary using the VQ-VAE, then train the shape-conditioned decoder-only transformer on this vocabulary for shape-conditioned autoregressive mesh generation. Our extensive experiments show that our method generates AMs with hundreds of times fewer faces, significantly improving storage, rendering, and simulation efficiency, while achieving precision comparable to previous methods.
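The two-stage idea described above (learn a discrete mesh vocabulary, then model token sequences autoregressively under a shape condition) can be illustrated with a toy sketch. This is not the authors' implementation: the 2D codebook, the feature vectors, the special token ids, and the choice to prepend shape tokens as a prefix are all assumptions made purely for illustration, and every function name here is hypothetical.

```python
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def squared_distance(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def quantize(features: Sequence[Vector], codebook: Sequence[Vector]) -> List[int]:
    """Vector quantization step: map each feature to its nearest codebook index."""
    return [
        min(range(len(codebook)), key=lambda k: squared_distance(f, codebook[k]))
        for f in features
    ]


def build_sequence(shape_tokens: Sequence[int], mesh_tokens: Sequence[int],
                   bos: int, eos: int) -> List[int]:
    """Assumed layout: shape condition as a prefix, then the mesh tokens."""
    return list(shape_tokens) + [bos] + list(mesh_tokens) + [eos]


def next_token_pairs(sequence: Sequence[int],
                     prefix_len: int) -> List[Tuple[List[int], int]]:
    """Autoregressive training pairs: predict each token from everything before it.

    Only positions after the shape prefix and BOS token are used as targets.
    """
    return [(list(sequence[:i]), sequence[i])
            for i in range(prefix_len + 1, len(sequence))]


# Toy data (hypothetical): a 4-entry codebook and three per-face feature vectors.
codebook = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
face_features = [(0.1, -0.1), (0.9, 0.2), (0.8, 0.9)]

mesh_tokens = quantize(face_features, codebook)

# Hypothetical ids: shape tokens 100 and 101, BOS = 4, EOS = 5.
sequence = build_sequence([100, 101], mesh_tokens, bos=4, eos=5)
pairs = next_token_pairs(sequence, prefix_len=2)

print(mesh_tokens)
print(sequence)
print(pairs[0])
```

In a real system the codebook would be learned by the VQ-VAE and the next-token distribution would come from a trained transformer; the sketch only shows how quantized mesh tokens and a shape condition might be arranged into one sequence for next-token prediction.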
