Riemannian Motion Generation: A Unified Framework for Human Motion Representation and Generation via Riemannian Flow Matching

Abstract

Human motion generation is usually learned in Euclidean space, even though valid motions follow a structured non-Euclidean geometry. We present Riemannian Motion Generation (RMG), a unified framework that represents motion on a product manifold and learns its dynamics via Riemannian flow matching. RMG factorizes motion into several manifold factors, which yields a scale-free representation with intrinsic normalization, and it uses geodesic interpolation, tangent-space supervision, and manifold-preserving ODE integration for training and sampling. On HumanML3D, RMG achieves state-of-the-art FID in the HumanML3D format (0.043) and ranks first on all reported metrics under the MotionStreamer format. On MotionMillion, it also surpasses strong baselines (FID 5.6, R@1 0.86). Ablations show that the compact T+R (translation + rotations) representation is the most stable and effective, highlighting geometry-aware modeling as a practical and scalable route to high-fidelity motion generation.
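The three ingredients named in the abstract (geodesic interpolation, tangent-space supervision, and manifold-preserving ODE integration) can be illustrated on a single manifold factor. The sketch below is not the RMG implementation: it assumes the unit sphere as a stand-in factor (for example, a unit quaternion on S^3), and all function names are hypothetical. An oracle velocity field takes the place of the learned network, so the example only shows the geometry of the training target and of the sampler.

```python
import numpy as np

def sphere_exp(x, v):
    """Exponential map on the unit sphere: move from x along tangent vector v."""
    norm = np.linalg.norm(v)
    if norm < 1e-12:
        return x
    return np.cos(norm) * x + np.sin(norm) * (v / norm)

def sphere_log(x, y):
    """Logarithm map: tangent vector at x whose length is the geodesic distance to y."""
    cos_theta = np.clip(np.dot(x, y), -1.0, 1.0)
    theta = np.arccos(cos_theta)
    direction = y - cos_theta * x  # component of y orthogonal to x
    d_norm = np.linalg.norm(direction)
    if d_norm < 1e-12:
        return np.zeros_like(x)
    return theta * direction / d_norm

def geodesic_interpolate(x0, x1, t):
    """Point at fraction t along the geodesic from x0 (noise) to x1 (data)."""
    return sphere_exp(x0, t * sphere_log(x0, x1))

def target_velocity(x_t, x1, t):
    """Tangent-space regression target at x_t: the time derivative of the geodesic."""
    return sphere_log(x_t, x1) / (1.0 - t)

def integrate(x0, velocity_fn, num_steps=100):
    """Euler steps taken through the exponential map, so iterates stay on the sphere."""
    x = x0
    dt = 1.0 / num_steps
    for k in range(num_steps):
        t = k * dt
        v = velocity_fn(x, t)
        v = v - np.dot(v, x) * x       # project onto the tangent space at x
        x = sphere_exp(x, dt * v)
        x = x / np.linalg.norm(x)      # guard against floating-point drift
    return x

# Assumed setup: two random unit 4-vectors standing in for a noise sample and a data sample.
rng = np.random.default_rng(0)
x0 = rng.normal(size=4)
x0 /= np.linalg.norm(x0)
x1 = rng.normal(size=4)
x1 /= np.linalg.norm(x1)

x_mid = geodesic_interpolate(x0, x1, 0.5)
u_mid = target_velocity(x_mid, x1, 0.5)

# The oracle field stands in for a trained network v_theta(x, t).
x_end = integrate(x0, lambda x, t: target_velocity(x, x1, t))
```

In a training loop, a network would be regressed onto `target_velocity` at randomly drawn `t`, with the loss measured in the tangent space at `x_t`. A product manifold would apply the same maps factor by factor, using ordinary linear interpolation for Euclidean factors such as translation.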