Kinematify: Open-Vocabulary Synthesis of High-DoF Articulated Objects
November 3, 2025
Authors: Jiawei Wang, Dingyou Wang, Jiaming Hu, Qixuan Zhang, Jingyi Yu, Lan Xu
cs.AI
Abstract
A deep understanding of kinematic structures and movable components is
essential for enabling robots to manipulate objects and model their own
articulated forms. Such understanding is captured through articulated objects,
which are indispensable for tasks such as physical simulation, motion planning, and
policy learning. However, creating these models, particularly for objects with
high degrees of freedom (DoF), remains a significant challenge. Existing
methods typically rely on motion sequences or strong assumptions from
hand-curated datasets, which hinders scalability. In this paper, we introduce
Kinematify, an automated framework that synthesizes articulated objects
directly from arbitrary RGB images or textual descriptions. Our method
addresses two core challenges: (i) inferring kinematic topologies for high-DoF
objects and (ii) estimating joint parameters from static geometry. To achieve
this, we combine Monte Carlo tree search (MCTS) for structural inference with geometry-driven
optimization for joint reasoning, producing physically consistent and
functionally valid descriptions. We evaluate Kinematify on diverse inputs from
both synthetic and real-world environments, demonstrating improvements in
registration and kinematic topology accuracy over prior work.
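To make the structural-inference step concrete, below is a minimal, generic MCTS skeleton for assembling a kinematic tree part by part. It only illustrates the search pattern the abstract names, not Kinematify's actual implementation: the state representation (a part-to-parent map), the action space (attaching one unassigned part), and the plausibility_score reward are all hypothetical placeholders.

```python
# Minimal MCTS sketch for kinematic topology inference. All names are
# hypothetical: a "state" is a partial kinematic tree (part -> parent),
# an action attaches one unassigned part, and plausibility_score stands
# in for whatever geometry-driven scoring the paper actually uses.
import math
import random

def plausibility_score(tree):
    """Placeholder reward: score a complete kinematic tree in [0, 1]."""
    return random.random()  # replace with a physically grounded score

class Node:
    def __init__(self, tree, unassigned, parent=None):
        self.tree = tree              # dict: part -> parent part
        self.unassigned = unassigned  # parts not yet placed in the tree
        self.parent = parent
        self.children = []
        self.visits = 0
        self.value = 0.0

    def expand(self):
        part = self.unassigned[0]
        for anchor in ["base"] + list(self.tree):
            child_tree = dict(self.tree, **{part: anchor})
            self.children.append(Node(child_tree, self.unassigned[1:], self))

    def best_child(self, c=1.4):  # UCB1 selection rule
        return max(self.children, key=lambda n: n.value / (n.visits + 1e-9)
                   + c * math.sqrt(math.log(self.visits + 1) / (n.visits + 1e-9)))

def mcts(parts, iterations=1000):
    root = Node({}, parts)
    for _ in range(iterations):
        node = root
        while node.children:          # 1. selection
            node = node.best_child()
        if node.unassigned:           # 2. expansion
            node.expand()
            node = random.choice(node.children)
        tree = dict(node.tree)        # 3. rollout: random completion
        for part in node.unassigned:
            tree[part] = random.choice(["base"] + list(tree))
        reward = plausibility_score(tree)
        while node:                   # 4. backpropagation
            node.visits += 1
            node.value += reward
            node = node.parent
    return max(root.children, key=lambda n: n.visits).tree

if __name__ == "__main__":
    print(mcts(["door", "handle", "shelf"], iterations=200))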
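Similarly, the geometry-driven joint reasoning the abstract mentions could look, in the simplest case, like the sketch below, which hypothesizes a revolute joint between two adjacent part point clouds from their contact region alone. The contact threshold, the SVD-based axis fit, and the toy door/frame data are illustrative assumptions; the paper's actual optimization is not specified here.

```python
# Hedged sketch of joint-parameter estimation from static geometry.
# Not the paper's method: as a simple stand-in, it guesses a revolute
# joint between two parts by (a) collecting near-contact points and
# (b) taking the principal direction of that contact region as the
# hinge axis, with its centroid as the pivot.
import numpy as np

def estimate_revolute_joint(part_a, part_b, contact_eps=0.01):
    """part_a, part_b: (N,3) and (M,3) point clouds of adjacent parts."""
    # Pairwise distances; keep points of part_a close to part_b.
    d = np.linalg.norm(part_a[:, None, :] - part_b[None, :, :], axis=-1)
    contact = part_a[d.min(axis=1) < contact_eps]
    if len(contact) < 3:
        raise ValueError("parts do not touch; no joint hypothesized")
    pivot = contact.mean(axis=0)
    # Principal direction of the contact strip = candidate hinge axis.
    _, _, vt = np.linalg.svd(contact - pivot)
    axis = vt[0] / np.linalg.norm(vt[0])
    return pivot, axis

if __name__ == "__main__":
    # Toy example: a "door" sharing a vertical edge with a "frame".
    z = np.linspace(0.0, 1.0, 50)
    frame = np.stack([np.zeros_like(z), np.zeros_like(z), z], axis=1)
    door = np.stack([np.linspace(0, 0.005, 50), np.zeros_like(z), z], axis=1)
    pivot, axis = estimate_revolute_joint(door, frame)
    print("pivot:", pivot.round(3), "axis:", axis.round(3))
```

In this toy case the recovered axis is (approximately) the vertical shared edge, which is the intended hinge; a real system would score such hypotheses against the full part geometry rather than the contact strip alone.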