Learning the 3D Fauna of the Web
January 4, 2024
Authors: Zizhang Li, Dor Litvak, Ruining Li, Yunzhi Zhang, Tomas Jakab, Christian Rupprecht, Shangzhe Wu, Andrea Vedaldi, Jiajun Wu
cs.AI
Abstract
Learning 3D models of all animals on Earth requires massively scaling up
existing solutions. With this ultimate goal in mind, we develop 3D-Fauna, an
approach that learns a pan-category deformable 3D animal model for more than
100 animal species jointly. One crucial bottleneck in modeling animals is the
limited availability of training data, which we overcome by simply learning
from 2D Internet images. We show that prior category-specific attempts fail to
generalize to rare species with limited training images. We address this
challenge by introducing the Semantic Bank of Skinned Models (SBSM), which
automatically discovers a small set of base animal shapes by combining
geometric inductive priors with semantic knowledge implicitly captured by an
off-the-shelf self-supervised feature extractor. To train such a model, we also
contribute a new large-scale dataset of diverse animal species. At inference
time, given a single image of any quadruped animal, our model reconstructs an
articulated 3D mesh in a feed-forward fashion within seconds.