Learning the 3D Fauna of the Web
January 4, 2024
Authors: Zizhang Li, Dor Litvak, Ruining Li, Yunzhi Zhang, Tomas Jakab, Christian Rupprecht, Shangzhe Wu, Andrea Vedaldi, Jiajun Wu
cs.AI
Abstract
Learning 3D models of all animals on Earth requires massively scaling up
existing solutions. With this ultimate goal in mind, we develop 3D-Fauna, an
approach that learns a pan-category deformable 3D animal model for more than
100 animal species jointly. One crucial bottleneck in modeling animals is the
limited availability of training data, which we overcome by simply learning
from 2D Internet images. We show that prior category-specific attempts fail to
generalize to rare species with limited training images. We address this
challenge by introducing the Semantic Bank of Skinned Models (SBSM), which
automatically discovers a small set of base animal shapes by combining
geometric inductive priors with semantic knowledge implicitly captured by an
off-the-shelf self-supervised feature extractor. To train such a model, we also
contribute a new large-scale dataset of diverse animal species. At inference
time, given a single image of any quadruped animal, our model reconstructs an
articulated 3D mesh in a feed-forward fashion within seconds.
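
The abstract does not spell out how the Semantic Bank of Skinned Models selects a base shape, so the following is only an illustrative sketch of one plausible reading: a soft lookup in which an image feature (standing in for the output of a self-supervised extractor) is compared against a small set of learned keys, and the matching base-shape embeddings are blended by similarity. The function `query_bank`, the `temperature` parameter, and the toy keys and values are all hypothetical and are not taken from the paper.

```python
import math


def normalize(v):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def query_bank(feature, keys, values, temperature=0.1):
    """Hypothetical soft lookup: blend base-shape embeddings by feature similarity.

    feature: image feature vector (assumed to come from a pretrained extractor)
    keys:    one key vector per bank entry (assumed learned)
    values:  one base-shape embedding per bank entry (assumed learned)
    """
    f = normalize(feature)
    sims = [sum(a * b for a, b in zip(normalize(k), f)) for k in keys]
    # Numerically stable softmax over the similarities.
    top = max(sims)
    exps = [math.exp((s - top) / temperature) for s in sims]
    total = sum(exps)
    weights = [e / total for e in exps]
    # The shape code is the weighted sum of the bank's value embeddings.
    dim = len(values[0])
    code = [sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)]
    return code, weights


# Toy bank with three entries; the numbers are made up for illustration.
keys = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
values = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
feature = [0.1, 0.9, 0.0]  # closest to the second key

shape_code, weights = query_bank(feature, keys, values)
best = max(range(len(weights)), key=weights.__getitem__)
print(best, [round(w, 4) for w in weights])
```

Under this assumption, species with few training images can still reuse bank entries shaped by visually similar, better-represented species, which is one way the sharing described in the abstract could arise. A lower `temperature` makes the lookup closer to picking a single entry, while a higher one blends several.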