Ponimator: Unfolding Interactive Pose for Versatile Human-human Interaction Animation

October 16, 2025
Authors: Shaowei Liu, Chuan Guo, Bing Zhou, Jian Wang
cs.AI

Abstract

Close-proximity human-human interactive poses convey rich contextual information about interaction dynamics. Given such poses, humans can intuitively infer the context and anticipate possible past and future dynamics, drawing on strong priors of human behavior. Inspired by this observation, we propose Ponimator, a simple framework anchored on proximal interactive poses for versatile interaction animation. Our training data consists of close-contact two-person poses and their surrounding temporal context from motion-capture interaction datasets. Leveraging interactive pose priors, Ponimator employs two conditional diffusion models: (1) a pose animator that uses the temporal prior to generate dynamic motion sequences from interactive poses, and (2) a pose generator that applies the spatial prior to synthesize interactive poses from a single pose, text, or both when interactive poses are unavailable. Collectively, Ponimator supports diverse tasks, including image-based interaction animation, reaction animation, and text-to-interaction synthesis, facilitating the transfer of interaction knowledge from high-quality mocap data to open-world scenarios. Empirical experiments across diverse datasets and applications demonstrate the universality of the pose prior and the effectiveness and robustness of our framework.
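The two-model design described above can be read as a simple routing rule: if an interactive (two-person) pose is available, animate it directly; otherwise synthesize one from a single pose, text, or both, and then animate it. The following is a minimal sketch of that control flow only, not the authors' implementation. The names `Pipeline`, `generate_pose`, `animate_pose`, the flat-list pose representation, and the stub functions are all hypothetical, and the stubs are trivial placeholders rather than diffusion models.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple

Pose = List[float]            # hypothetical flat pose vector for one person
PosePair = Tuple[Pose, Pose]  # an interactive (two-person) pose
Motion = List[PosePair]       # a sequence of two-person poses


@dataclass
class Pipeline:
    # Both callables stand in for the two conditional diffusion models.
    # Their signatures are assumptions made for this sketch.
    generate_pose: Callable[[Optional[Pose], Optional[str]], PosePair]
    animate_pose: Callable[[PosePair], Motion]

    def run(
        self,
        pair: Optional[PosePair] = None,
        single: Optional[Pose] = None,
        text: Optional[str] = None,
    ) -> Motion:
        # No interactive pose given: synthesize one from a single pose,
        # text, or both (the role of the pose generator).
        if pair is None:
            if single is None and text is None:
                raise ValueError("need an interactive pose, a single pose, or text")
            pair = self.generate_pose(single, text)
        # Unfold the interactive pose into a motion sequence
        # (the role of the pose animator).
        return self.animate_pose(pair)


def stub_generate(single: Optional[Pose], text: Optional[str]) -> PosePair:
    # Placeholder: keep the given pose (or a zero pose) and offset a copy of it.
    first = single if single is not None else [0.0, 0.0, 0.0]
    second = [x + 1.0 for x in first]
    return (first, second)


def stub_animate(pair: PosePair, frames_each_side: int = 2) -> Motion:
    # Placeholder: repeat the anchor pose for past and future frames.
    # Treating the anchor as the middle frame is an assumption of this sketch.
    return [pair] * (2 * frames_each_side + 1)


pipeline = Pipeline(generate_pose=stub_generate, animate_pose=stub_animate)
from_text = pipeline.run(text="two people hug")
from_single = pipeline.run(single=[0.5, 0.5, 0.5])
from_pair = pipeline.run(pair=([0.0, 0.0, 0.0], [1.0, 1.0, 1.0]))
```

Under this reading, the three calls at the end loosely correspond to text-to-interaction synthesis, reaction animation from one person's pose, and animation of an interactive pose that is already available (for example, one estimated from an image). That mapping is an interpretation of the abstract, not a documented interface.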