
Skin Tokens: A Learned Compact Representation for Unified Autoregressive Rigging

February 4, 2026
Authors: Jia-peng Zhang, Cheng-Feng Pu, Meng-Hao Guo, Yan-Pei Cao, Shi-Min Hu
cs.AI

Abstract

The rapid proliferation of generative 3D models has created a critical bottleneck in animation pipelines: rigging. Existing automated methods are fundamentally limited by their approach to skinning, treating it as an ill-posed, high-dimensional regression task that is inefficient to optimize and is typically decoupled from skeleton generation. We posit this is a representation problem and introduce SkinTokens: a learned, compact, and discrete representation for skinning weights. By leveraging an FSQ-CVAE to capture the intrinsic sparsity of skinning, we reframe the task from continuous regression to a more tractable token sequence prediction problem. This representation enables TokenRig, a unified autoregressive framework that models the entire rig as a single sequence of skeletal parameters and SkinTokens, learning the complicated dependencies between skeletons and skin deformations. The unified model is then amenable to a reinforcement learning stage, where tailored geometric and semantic rewards improve generalization to complex, out-of-distribution assets. Quantitatively, the SkinTokens representation leads to a 98%-133% improvement in skinning accuracy over state-of-the-art methods, while the full TokenRig framework, refined with RL, enhances bone prediction by 17%-22%. Our work presents a unified, generative approach to rigging that yields higher fidelity and robustness, offering a scalable solution to a long-standing challenge in 3D content creation.
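The abstract does not describe how the FSQ-CVAE is built, so the following is only a minimal sketch of generic finite scalar quantization (FSQ), the step that turns a continuous latent vector into a single discrete token. It is not the authors' implementation. The function names `fsq_quantize` and `fsq_decode`, the choice of three channels with five levels each, and the restriction to odd level counts are all assumptions made for illustration.

```python
import numpy as np


def fsq_quantize(z, levels):
    """Generic FSQ sketch (hypothetical helper, not from the paper).

    z      : array of shape (..., d), continuous latents.
    levels : d odd integers, the number of quantization levels per channel.
    Returns (codes, tokens): integer codes centred on zero, and one
    integer token id per latent vector.
    """
    levels = np.asarray(levels, dtype=np.int64)
    half = (levels - 1) // 2                      # e.g. 5 levels -> codes in [-2, 2]
    bounded = np.tanh(np.asarray(z, dtype=np.float64)) * half
    codes = np.round(bounded).astype(np.int64)    # snap each channel to the grid
    digits = codes + half                         # shift to [0, levels - 1]
    # Mixed-radix basis: each channel acts as one digit of the token id.
    basis = np.concatenate(([1], np.cumprod(levels[:-1])))
    tokens = (digits * basis).sum(axis=-1)
    return codes, tokens


def fsq_decode(tokens, levels):
    """Invert the mixed-radix packing: token id -> zero-centred integer codes."""
    levels = np.asarray(levels, dtype=np.int64)
    half = (levels - 1) // 2
    basis = np.concatenate(([1], np.cumprod(levels[:-1])))
    digits = (np.asarray(tokens)[..., None] // basis) % levels
    return digits - half


if __name__ == "__main__":
    levels = [5, 5, 5]                            # assumed: 5 * 5 * 5 = 125 possible tokens
    z = np.array([[0.0, 0.0, 0.0], [10.0, 10.0, 10.0], [-10.0, -10.0, -10.0]])
    codes, tokens = fsq_quantize(z, levels)
    print(codes)
    print(tokens)
    print(fsq_decode(tokens, levels))
```

Under these assumptions the codebook is implicit: its size is the product of the per-channel level counts, and no learned embedding table is needed for the quantization step. A sequence model can then predict the token ids directly, which is the kind of token sequence prediction the abstract contrasts with continuous regression.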