

DragMesh: Interactive 3D Generation Made Easy

December 6, 2025
Authors: Tianshan Zhang, Zeyu Zhang, Hao Tang
cs.AI

Abstract

While generative models have excelled at creating static 3D content, building systems that understand how objects move and respond to interactions remains a fundamental challenge. Current methods for articulated motion lie at a crossroads: they are either physically consistent but too slow for real-time use, or generative but prone to violating basic kinematic constraints. We present DragMesh, a robust framework for real-time interactive 3D articulation built around a lightweight motion generation core. Our core contribution is a novel decoupled kinematic reasoning and motion generation framework. First, we infer the latent joint parameters by decoupling semantic intent reasoning (which determines the joint type) from geometric regression (which determines the axis and origin using our Kinematics Prediction Network, KPP-Net). Second, to leverage the compact, continuous, and singularity-free properties of dual quaternions for representing rigid-body motion, we develop a novel Dual Quaternion VAE (DQ-VAE). The DQ-VAE takes these predicted priors, along with the user's drag, and generates a complete, plausible motion trajectory. To ensure strict adherence to kinematics, we inject the joint priors at every layer of the DQ-VAE's non-autoregressive Transformer decoder using FiLM (Feature-wise Linear Modulation) conditioning. This persistent, multi-scale guidance is complemented by a numerically stable cross-product loss that guarantees axis alignment. This decoupled design allows DragMesh to achieve real-time performance and enables plausible, generative articulation on novel objects without retraining, offering a practical step toward generative 3D intelligence. Code: https://github.com/AIGeeksGroup/DragMesh. Website: https://aigeeksgroup.github.io/DragMesh.
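The abstract leans on two pieces of machinery worth unpacking: dual quaternions as a compact, singularity-free encoding of a rigid motion (rotation plus translation), and a cross-product loss that penalizes misalignment between a predicted joint axis and the ground-truth axis without the gradient blow-up of an angle-based (arccos) loss. The sketch below is not the paper's code; it is a minimal NumPy illustration, and the function names, the (w, x, y, z) quaternion convention, and the squared-norm form of the loss are all assumptions.

```python
import numpy as np

def quat_mul(a, b):
    # Hamilton product of two quaternions stored as (w, x, y, z).
    w1, x1, y1, z1 = a
    w2, x2, y2, z2 = b
    return np.array([
        w1*w2 - x1*x2 - y1*y2 - z1*z2,
        w1*x2 + x1*w2 + y1*z2 - z1*y2,
        w1*y2 - x1*z2 + y1*w2 + z1*x2,
        w1*z2 + x1*y2 - y1*x2 + z1*w2,
    ])

def dual_quaternion(axis, angle, translation):
    # Real part: unit quaternion for a rotation of `angle` about `axis`.
    axis = axis / np.linalg.norm(axis)
    q_r = np.concatenate([[np.cos(angle / 2)], np.sin(angle / 2) * axis])
    # Dual part: 0.5 * t * q_r, with the translation t as a pure quaternion.
    # Together (q_r, q_d) encode the full rigid motion with 8 numbers.
    t = np.concatenate([[0.0], translation])
    q_d = 0.5 * quat_mul(t, q_r)
    return q_r, q_d

def axis_cross_loss(pred_axis, gt_axis):
    # ||a_hat x b_hat||^2 vanishes iff the axes are parallel, and its
    # gradient stays bounded near alignment (unlike arccos of a dot product).
    a = pred_axis / np.linalg.norm(pred_axis)
    b = gt_axis / np.linalg.norm(gt_axis)
    return float(np.sum(np.cross(a, b) ** 2))
```

A unit dual quaternion satisfies the orthogonality constraint q_r . q_d = 0, which the construction above produces by design; that is what makes the representation continuous and free of the gimbal-lock singularities of Euler angles.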
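The FiLM conditioning mentioned above has a simple functional form: the conditioning vector (here, the predicted joint prior) is mapped to a per-channel scale gamma and shift beta, which modulate the decoder features at every layer, so the kinematic guidance is never diluted as depth grows. A minimal NumPy sketch, assuming hypothetical dimensions and linear projections not taken from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

def film(features, cond, W_gamma, W_beta):
    # FiLM: the condition vector yields a per-channel scale (gamma) and
    # shift (beta) applied uniformly across all tokens of this layer.
    gamma = cond @ W_gamma            # (d_cond,) @ (d_cond, d_model) -> (d_model,)
    beta = cond @ W_beta
    return gamma * features + beta    # broadcasts over tokens: (n, d_model)

# Hypothetical sizes: an 8-dim joint prior modulating 16-dim decoder features.
n_tokens, d_model, d_cond = 4, 16, 8
h = rng.standard_normal((n_tokens, d_model))
prior = rng.standard_normal(d_cond)
W_g = rng.standard_normal((d_cond, d_model))
W_b = rng.standard_normal((d_cond, d_model))
out = film(h, prior, W_g, W_b)
```

Repeating this modulation at every decoder layer is what the abstract calls "persistent, multi-scale guidance": the joint prior re-enters the computation at each depth rather than only at the input.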