
EgoForce: Forearm-Guided Camera-Space 3D Hand Pose from a Monocular Egocentric Camera

May 12, 2026
Authors: Christen Millerdurai, Shaoxiang Wang, Yaxu Xie, Vladislav Golyanik, Didier Stricker, Alain Pagani
cs.AI

Abstract

Reconstructing the absolute 3D pose and shape of the hands from the user's viewpoint using a single head-mounted camera is crucial for practical egocentric interaction in AR/VR, telepresence, and hand-centric manipulation tasks, where sensing must remain compact and unobtrusive. While monocular RGB methods have made progress, they remain constrained by depth-scale ambiguity and struggle to generalize across the diverse optical configurations of head-mounted devices. As a result, models typically require extensive training on device-specific datasets, which are costly and laborious to acquire. This paper addresses these challenges by introducing EgoForce, a monocular 3D hand reconstruction framework that recovers robust, absolute 3D hand pose and its position from the user's (camera-space) viewpoint. EgoForce operates across fisheye, perspective, and distorted wide-FOV camera models using a single unified network. Our approach combines a differentiable forearm representation that stabilizes hand pose, a unified arm-hand transformer that predicts both hand and forearm geometry from a single egocentric view, mitigating depth-scale ambiguity, and a ray space closed-form solver that enables absolute 3D pose recovery across diverse head-mounted camera models. Experiments on three egocentric benchmarks show that EgoForce achieves state-of-the-art 3D accuracy, reducing camera-space MPJPE by up to 28% on the HOT3D dataset compared to prior methods and maintaining consistent performance across camera configurations. For more details, visit the project page at https://dfki-av.github.io/EgoForce.
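The abstract mentions a ray-space closed-form solver for recovering the absolute 3D hand position across different camera models. As a rough illustration of the general idea (not necessarily EgoForce's exact formulation), one common closed-form approach aligns predicted root-relative 3D joints to per-joint camera rays by solving a linear least-squares problem for the global translation — the function name and setup below are illustrative assumptions:

```python
import numpy as np

def solve_translation(joints_rel, rays):
    """Closed-form least-squares root translation t.

    Given root-relative 3D joints p_i and camera rays d_i (unprojected
    from 2D detections under any camera model), minimize
        sum_i || (I - d_i d_i^T) (p_i + t) ||^2,
    i.e. the distance of each translated joint to its ray.
    Generic sketch only; not claimed to be the paper's solver.
    """
    A = np.zeros((3, 3))
    b = np.zeros(3)
    for p, d in zip(joints_rel, rays):
        d = d / np.linalg.norm(d)
        P = np.eye(3) - np.outer(d, d)  # projector orthogonal to the ray
        A += P                           # normal equations: A t = b
        b -= P @ p
    return np.linalg.solve(A, b)
```

Because the objective is quadratic in t, the solution is a single 3x3 linear solve; working in ray space (rather than pixel space) is what makes the same solver applicable to fisheye, perspective, and distorted wide-FOV models, since lens geometry is absorbed into the unprojection step.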