AnyMo: Geometry-Aware Setup-Agnostic Modeling of Human Motion in the Wild
May 21, 2026
Authors: Baiyu Chen, Zechen Li, Wilson Wongso, Lihuan Li, Xiachong Lin, Hao Xue, Benjamin Tag, Flora Salim
cs.AI
Abstract
As wearable and mobile devices become increasingly embedded in daily life, they offer a practical way to continuously sense human motion in the wild. However, inertial signals depend heavily on the sensing setup, including body location, mounting position, sensor orientation, device hardware, and sampling protocol. This setup dependence makes it difficult to learn motion representations that transfer across devices and datasets, and it limits the broader use of wearable inertial measurement units (IMUs) beyond closed-set recognition. We introduce AnyMo, a geometry-aware framework for setup-agnostic human motion modeling. AnyMo uses physics-grounded IMU simulation over dense body-surface placements to generate diverse and plausible synthetic signals, pre-trains a graph encoder from paired synthetic placement views and masked partial observations, tokenizes multi-position IMU signals into full-body motion tokens, and aligns these tokens with a large language model (LLM) for motion-language understanding. We evaluate AnyMo on three complementary tasks: zero-shot activity recognition across 14 unseen downstream datasets, cross-modal retrieval, and wearable IMU motion captioning. On human activity recognition (HAR), it improves average Accuracy/F1/R@2 by 11.7%/11.6%/22.6%; it increases zero-shot IMU-to-text and text-to-IMU retrieval mean reciprocal rank (MRR) by 15.9% and 28.6%, respectively; and it improves zero-shot captioning BERT-F1 by 18.8%. These results support AnyMo as a generalist model for wearable motion understanding in the wild. Project page: https://baiyuchen.com/project/AnyMo.
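For readers unfamiliar with the retrieval metrics quoted above, the following is a minimal sketch of how MRR and recall at rank k are commonly computed from a query-by-candidate similarity matrix. It is not taken from the AnyMo code: the function names, the toy scores, and the reading of R@2 as "correct item ranked in the top 2" are assumptions made for illustration, since the abstract does not define them.

```python
# Illustrative only: standard definitions of MRR and recall@k.
# All names and the toy data are hypothetical, not from the AnyMo release.

def rank_of_correct(scores, correct_index):
    """Rank (1 = best) of the correct candidate under descending score order."""
    target = scores[correct_index]
    # Rank is 1 plus the number of candidates that score strictly higher.
    return 1 + sum(1 for s in scores if s > target)

def mean_reciprocal_rank(score_matrix, correct_indices):
    """Average of 1/rank over all queries."""
    ranks = [rank_of_correct(row, idx)
             for row, idx in zip(score_matrix, correct_indices)]
    return sum(1.0 / r for r in ranks) / len(ranks)

def recall_at_k(score_matrix, correct_indices, k):
    """Fraction of queries whose correct candidate is ranked within the top k."""
    hits = sum(1 for row, idx in zip(score_matrix, correct_indices)
               if rank_of_correct(row, idx) <= k)
    return hits / len(score_matrix)

# Toy similarity scores: 3 queries (e.g. IMU clips) x 3 candidates (e.g. captions).
scores = [
    [0.9, 0.1, 0.3],  # correct candidate 0 is ranked 1st
    [0.2, 0.4, 0.8],  # correct candidate 1 is ranked 2nd
    [0.5, 0.7, 0.6],  # correct candidate 2 is ranked 2nd
]
correct = [0, 1, 2]

mrr = mean_reciprocal_rank(scores, correct)  # (1 + 1/2 + 1/2) / 3
r_at_1 = recall_at_k(scores, correct, 1)     # 1 of 3 queries
r_at_2 = recall_at_k(scores, correct, 2)     # 3 of 3 queries
print(f"MRR={mrr:.3f}  R@1={r_at_1:.3f}  R@2={r_at_2:.3f}")
```

In this toy case the correct item is never ranked below second, so R@2 is perfect even though MRR is not, which shows why the two metrics are reported separately.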