

Choreographing a World of Dynamic Objects

January 7, 2026
Authors: Yanzhe Lyu, Chen Geng, Karthik Dharmarajan, Yunzhi Zhang, Hadi Alzayer, Shangzhe Wu, Jiajun Wu
cs.AI

Abstract

Dynamic objects in our physical 4D (3D + time) world are constantly evolving, deforming, and interacting with other objects, giving rise to diverse 4D scene dynamics. In this paper, we present CHORD, a universal generative pipeline for CHOReographing Dynamic objects and scenes and synthesizing such phenomena. Traditional rule-based graphics pipelines can create these dynamics through category-specific heuristics, but they are labor-intensive and do not scale. Recent learning-based methods typically demand large-scale datasets, which may not cover all object categories of interest. Our approach instead inherits the universality of video generative models through a distillation-based pipeline that extracts the rich Lagrangian motion information hidden in the Eulerian representations of 2D videos. Our method is universal, versatile, and category-agnostic. We demonstrate its effectiveness through experiments generating a diverse range of multi-body 4D dynamics, show its advantages over existing methods, and demonstrate its applicability to generating robotic manipulation policies. Project page: https://yanzhelyu.github.io/chord
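The distinction the abstract draws between Eulerian and Lagrangian motion can be illustrated concretely: an Eulerian representation stores motion at fixed pixel locations (e.g., per-frame optical flow), while a Lagrangian representation follows individual points over time. The sketch below, which assumes dense optical flow is already available, advects query points through a stack of flow fields to recover point trajectories; `track_points` is a hypothetical helper for illustration, not the paper's actual distillation pipeline.

```python
import numpy as np

def track_points(flows, points):
    """Advect query points through a stack of Eulerian flow fields.

    flows:  (T, H, W, 2) array; flows[t, y, x] is the (dx, dy) displacement
            of the pixel at (x, y) between frames t and t+1.
    points: (N, 2) array of (x, y) query positions in frame 0.
    Returns a (T+1, N, 2) array of Lagrangian trajectories.
    """
    T, H, W, _ = flows.shape
    traj = [points.astype(float)]
    for t in range(T):
        p = traj[-1]
        # Sample the flow at each point's current (rounded) location;
        # clipping keeps points that drift off-frame at the border.
        x = np.clip(np.round(p[:, 0]).astype(int), 0, W - 1)
        y = np.clip(np.round(p[:, 1]).astype(int), 0, H - 1)
        traj.append(p + flows[t, y, x])
    return np.stack(traj)
```

With a constant flow of (1, 0) everywhere, a point starting at the origin moves one pixel right per frame, showing how per-pixel (Eulerian) data becomes a per-point (Lagrangian) trajectory.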