Reconstructing Animatable Categories from Videos
May 10, 2023
Authors: Gengshan Yang, Chaoyang Wang, N Dinesh Reddy, Deva Ramanan
cs.AI
Abstract
Building animatable 3D models is challenging due to the need for 3D scans,
laborious registration, and manual rigging, which are difficult to scale to
arbitrary categories. Recently, differentiable rendering provides a pathway to
obtain high-quality 3D models from monocular videos, but these are limited to
rigid categories or single instances. We present RAC, which builds category-level
3D models from monocular videos while disentangling variation across instances
from motion over time. Three key ideas are introduced to solve this problem: (1)
specializing a skeleton to instances via optimization, (2) a method for latent
space regularization that encourages shared structure across a category while
maintaining instance details, and (3) using 3D background models to disentangle
objects from the background. We show that 3D models of humans, cats, and dogs
can be learned from 50-100 internet videos.
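The second key idea, latent-space regularization that encourages shared structure across a category while preserving instance detail, can be illustrated with a minimal sketch. The function below is a hypothetical stand-in for the paper's actual loss: it simply penalizes the deviation of each per-instance latent code from the category mean, so codes stay clustered (shared structure) while non-zero residuals retain instance-specific detail. The function name, the quadratic penalty, and the `beta` weight are illustrative assumptions, not details from the paper.

```python
import numpy as np

def latent_regularizer(codes, beta=0.1):
    """Hypothetical regularizer: penalize deviation of per-instance
    latent codes from the category mean code.

    codes: (num_instances, code_dim) array of instance embeddings.
    beta:  weight of the regularization term (assumed hyperparameter).
    Returns a scalar penalty; 0 when all instances share one code.
    """
    mean_code = codes.mean(axis=0, keepdims=True)   # shared category code
    residuals = codes - mean_code                   # instance-specific detail
    return beta * np.mean(np.sum(residuals ** 2, axis=1))

# Identical instances incur no penalty; spread-out codes are penalized.
identical = np.ones((4, 3))
varied = np.array([[0.0, 0.0], [1.0, 1.0]])
print(latent_regularizer(identical))  # 0.0
print(latent_regularizer(varied))     # > 0
```

In a full training loop, such a term would be added to the rendering loss, so optimization trades off instance fidelity against staying close to the shared category structure.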