SAM 3D Animal: Promptable 3D Animal Reconstruction in the Wild

Abstract

3D animal reconstruction in the wild remains challenging because of large variation across species, frequent occlusions, and the prevalence of multi-animal scenes, yet existing methods predominantly focus on single-animal settings. We present SAM 3D Animal, the first promptable framework for multi-animal 3D reconstruction from a single image. Built on the SMAL+ parametric animal model, our method jointly reconstructs multiple instances and supports flexible prompts in the form of keypoints and masks, which enable more reliable disambiguation in crowded and occluded scenes. To train such a model, we further introduce Herd3D, a multi-animal 3D dataset containing over 5K images, designed to increase diversity in species, interactions, and occlusion patterns. Experiments on the Animal3D, APTv2, and Animal Kingdom datasets show that our framework achieves state-of-the-art results, outperforming existing model-based and model-free methods, and offers a scalable and effective solution for prompt-driven 3D animal reconstruction in the wild.
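To make the idea of prompt-based disambiguation concrete, the following is a minimal illustrative sketch and not the implementation of SAM 3D Animal. It assumes a hypothetical setting in which each candidate animal is represented by a set of foreground pixels, and a user prompt (keypoints, a mask, or both) is matched to the candidate it agrees with most. The names `InstancePrompt`, `prompt_score`, `select_instance`, and `rect` are assumptions introduced only for this example.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set, Tuple

Pixel = Tuple[int, int]  # (row, col)


@dataclass
class InstancePrompt:
    """Hypothetical prompt for one animal: optional keypoints and/or a mask."""
    keypoints: List[Pixel] = field(default_factory=list)
    mask: Optional[Set[Pixel]] = None


def mask_iou(a: Set[Pixel], b: Set[Pixel]) -> float:
    """Intersection over union of two pixel sets (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def prompt_score(prompt: InstancePrompt, candidate: Set[Pixel]) -> float:
    """Agreement between a prompt and a candidate mask (assumed scoring rule)."""
    score = 0.0
    if prompt.mask is not None:
        score += mask_iou(prompt.mask, candidate)
    if prompt.keypoints:
        inside = sum(1 for p in prompt.keypoints if p in candidate)
        score += inside / len(prompt.keypoints)
    return score


def select_instance(prompt: InstancePrompt, candidates: List[Set[Pixel]]) -> int:
    """Return the index of the candidate mask that best matches the prompt."""
    if not candidates:
        raise ValueError("at least one candidate mask is required")
    return max(range(len(candidates)),
               key=lambda i: prompt_score(prompt, candidates[i]))


def rect(r0: int, c0: int, r1: int, c1: int) -> Set[Pixel]:
    """Helper: filled rectangle of pixels, end-exclusive."""
    return {(r, c) for r in range(r0, r1) for c in range(c0, c1)}


# Two overlapping animals in a toy 6x6 image.
animal_a = rect(0, 0, 4, 4)
animal_b = rect(2, 2, 6, 6)

# A keypoint that lies only on the second animal selects it.
by_keypoint = select_instance(InstancePrompt(keypoints=[(5, 5)]),
                              [animal_a, animal_b])

# A partial mask covering the visible part of the second animal also selects it.
by_mask = select_instance(InstancePrompt(mask=rect(3, 3, 6, 6)),
                          [animal_a, animal_b])
```

In this toy setup, a single keypoint or a partial mask is enough to pick one of two overlapping animals, which is the kind of ambiguity that prompts are meant to resolve in crowded scenes. A real system would operate on learned features rather than on set overlap.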