Orient Anything V2: Unifying Orientation and Rotation Understanding
January 9, 2026
Authors: Zehan Wang, Ziang Zhang, Jiayang Xu, Jialei Wang, Tianyu Pang, Chao Du, Hengshuang Zhao, Zhou Zhao
cs.AI
Abstract
This work presents Orient Anything V2, an enhanced foundation model for unified understanding of object 3D orientation and rotation from single or paired images. Building upon Orient Anything V1, which defines orientation via a single unique front face, V2 extends this capability to handle objects with diverse rotational symmetries and directly estimate relative rotations. These improvements are enabled by four key innovations: 1) Scalable 3D assets synthesized by generative models, ensuring broad category coverage and balanced data distribution; 2) An efficient, model-in-the-loop annotation system that robustly identifies 0 to N valid front faces for each object; 3) A symmetry-aware, periodic distribution fitting objective that captures all plausible front-facing orientations, effectively modeling object rotational symmetry; 4) A multi-frame architecture that directly predicts relative object rotations. Extensive experiments show that Orient Anything V2 achieves state-of-the-art zero-shot performance on orientation estimation, 6DoF pose estimation, and object symmetry recognition across 11 widely used benchmarks. The model demonstrates strong generalization, significantly broadening the applicability of orientation estimation in diverse downstream tasks.
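The abstract does not spell out the symmetry-aware periodic fitting objective named in item 3. As a rough illustration only, the sketch below builds a target distribution over azimuth for an object with N equivalent front faces. It assumes azimuth discretized into 360 one-degree bins and a von Mises-style bump repeated every 360/N degrees; these choices, along with the names `periodic_target`, `peak_bins`, and `kappa`, are hypothetical and not taken from the paper.

```python
import math


def periodic_target(front_deg, n_fold, num_bins=360, kappa=20.0):
    """Build a normalized target distribution over azimuth bins.

    front_deg: azimuth (degrees) of one valid front face.
    n_fold:    number of equivalent front faces, spaced 360/n_fold apart.
               n_fold == 0 (no meaningful front) gives a uniform target.
    kappa:     sharpness of each peak (assumed hyperparameter).
    """
    if n_fold == 0:
        return [1.0 / num_bins] * num_bins
    bin_width = 360.0 / num_bins
    weights = []
    for i in range(num_bins):
        theta = math.radians(i * bin_width - front_deg)
        # cos(n_fold * theta) has period 360/n_fold degrees, so the bump
        # repeats once per equivalent front face.
        weights.append(math.exp(kappa * (math.cos(n_fold * theta) - 1.0)))
    total = sum(weights)
    return [w / total for w in weights]


def peak_bins(dist):
    """Return indices of bins that are strict local maxima on the circle."""
    n = len(dist)
    return [
        i for i in range(n)
        if dist[i] > dist[i - 1] and dist[i] > dist[(i + 1) % n]
    ]
```

Under these assumptions, an object with two-fold symmetry and a front at 30 degrees gets peaks at bins 30 and 210, while an object with no valid front gets a flat distribution with no peaks. A network trained to match such targets could, in principle, expose the symmetry order through the number of predicted peaks.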