

Skin Tokens: A Learned Compact Representation for Unified Autoregressive Rigging

February 4, 2026
Authors: Jia-peng Zhang, Cheng-Feng Pu, Meng-Hao Guo, Yan-Pei Cao, Shi-Min Hu
cs.AI

Abstract

The rapid proliferation of generative 3D models has created a critical bottleneck in animation pipelines: rigging. Existing automated methods are fundamentally limited by their approach to skinning, treating it as an ill-posed, high-dimensional regression task that is inefficient to optimize and is typically decoupled from skeleton generation. We posit that this is a representation problem and introduce SkinTokens: a learned, compact, and discrete representation for skinning weights. By leveraging an FSQ-CVAE to capture the intrinsic sparsity of skinning, we reframe the task from continuous regression to a more tractable token sequence prediction problem. This representation enables TokenRig, a unified autoregressive framework that models the entire rig as a single sequence of skeletal parameters and SkinTokens, learning the complex dependencies between skeletons and skin deformations. The unified model is then amenable to a reinforcement learning stage, where tailored geometric and semantic rewards improve generalization to complex, out-of-distribution assets. Quantitatively, the SkinTokens representation leads to a 98%-133% improvement in skinning accuracy over state-of-the-art methods, while the full TokenRig framework, refined with RL, enhances bone prediction by 17%-22%. Our work presents a unified, generative approach to rigging that yields higher fidelity and robustness, offering a scalable solution to a long-standing challenge in 3D content creation.
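
The abstract names finite scalar quantization (FSQ) inside a CVAE as the mechanism that turns continuous skinning weights into discrete tokens. The sketch below illustrates only the FSQ step: each latent channel is bounded and rounded to a small fixed set of levels, and the per-channel indices combine (mixed-radix) into a single integer token id. The latent dimensionality, the level counts, and the absence of the CVAE encoder/decoder are all assumptions for illustration, not the paper's configuration.

```python
import numpy as np

def fsq_quantize(z, levels):
    """Finite scalar quantization sketch (levels per channel are assumed).

    Each latent channel is squashed into (-1, 1) with tanh, mapped to
    [0, 1], and rounded to one of `levels[i]` uniform bins. The bin
    indices are then packed mixed-radix into a single integer token id,
    so a 3-channel latent with 5 levels per channel yields one of
    5 * 5 * 5 = 125 possible tokens.
    """
    z = np.asarray(z, dtype=float)
    codes = []
    for zi, num_levels in zip(z, levels):
        # Bound to (0, 1), then round to a bin index in [0, num_levels - 1].
        bounded = (np.tanh(zi) + 1.0) / 2.0
        codes.append(int(np.round(bounded * (num_levels - 1))))
    # Mixed-radix packing of per-channel indices into one token id.
    token = 0
    for code, num_levels in zip(codes, levels):
        token = token * num_levels + code
    return codes, token

# Example: quantize one latent vector into a single discrete token.
codes, token = fsq_quantize([0.3, -1.2, 0.8], levels=(5, 5, 5))
print(codes, token)
```

Because the codebook is implicit in the fixed level grid, FSQ avoids the codebook-collapse issues of learned vector quantizers, which is presumably why it suits a representation that must cover many sparse weight patterns.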
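TokenRig is described as modeling the entire rig as a single autoregressive sequence of skeletal parameters and SkinTokens. The following sketch shows one hypothetical way such a sequence could be laid out; the special-token ids, the uniform coordinate binning, and the id-range offsets are invented here for illustration and are not taken from the paper.

```python
# Hypothetical serialization of a rig into one flat token sequence, in the
# spirit of TokenRig: discretized joint coordinates, a separator, then the
# per-vertex SkinTokens. All ids and ranges below are assumptions.
BOS, EOS, SEP = 0, 1, 2          # assumed special tokens
NUM_COORD_BINS = 256             # assumed uniform quantization of coordinates

def coord_to_token(x, lo=-1.0, hi=1.0):
    """Clamp a joint coordinate to [lo, hi] and map it to a bin id,
    offset past the special tokens."""
    x = min(max(x, lo), hi)
    bin_id = int((x - lo) / (hi - lo) * (NUM_COORD_BINS - 1))
    return 3 + bin_id            # shift past BOS/EOS/SEP

def rig_to_sequence(joints, skin_tokens):
    """joints: list of (x, y, z) joint positions; skin_tokens: list of
    integer SkinToken ids from the FSQ quantizer. Returns one flat
    sequence suitable for next-token training."""
    seq = [BOS]
    for (x, y, z) in joints:
        seq += [coord_to_token(x), coord_to_token(y), coord_to_token(z)]
    seq.append(SEP)
    skin_base = 3 + NUM_COORD_BINS   # skin-token ids live after coord bins
    seq += [skin_base + t for t in skin_tokens]
    seq.append(EOS)
    return seq
```

The point of a single interleaved sequence is that the skinning tokens are predicted conditioned on the already-generated skeleton, which is how the framework can learn skeleton-to-deformation dependencies that decoupled pipelines miss.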