Learning the 3D Fauna of the Web

Abstract

Learning 3D models of all animals on Earth requires massively scaling up existing solutions. With this ultimate goal in mind, we develop 3D-Fauna, an approach that jointly learns a pan-category deformable 3D animal model for more than 100 animal species. One crucial bottleneck in modeling animals is the limited availability of training data, which we overcome by learning simply from 2D Internet images. We show that prior category-specific attempts fail to generalize to rare species with limited training images. We address this challenge by introducing the Semantic Bank of Skinned Models (SBSM), which automatically discovers a small set of base animal shapes by combining geometric inductive priors with semantic knowledge implicitly captured by an off-the-shelf self-supervised feature extractor. To train such a model, we also contribute a new large-scale dataset of diverse animal species. At inference time, given a single image of any quadruped animal, our model reconstructs an articulated 3D mesh in a feed-forward fashion within seconds.
