ZeroShape: Regression-based Zero-shot Shape Reconstruction

Abstract

We study the problem of single-image zero-shot 3D shape reconstruction. Recent works learn zero-shot shape reconstruction through generative modeling of 3D assets, but these models are computationally expensive at both training and inference time. In contrast, the traditional approach to this problem is regression-based: deterministic models are trained to directly regress the object shape. Such regression methods are far more computationally efficient than generative methods. This raises a natural question: is generative modeling necessary for high performance, or are regression-based approaches still competitive? To answer it, we design a strong regression-based model, called ZeroShape, based on the converging findings in this field and a novel insight. We also curate a large real-world evaluation benchmark with objects from three different real-world 3D datasets. This benchmark is more diverse and an order of magnitude larger than what prior works use to quantitatively evaluate their models, and it aims to reduce the evaluation variance in our field. We show that ZeroShape not only outperforms state-of-the-art methods, but is also significantly more efficient in both computation and data.
