Human Motion Unlearning

Abstract

We introduce the task of human motion unlearning: preventing the synthesis of toxic animations while preserving general text-to-motion generative performance. Unlearning toxic motions is challenging because they can be generated both from explicit text prompts and from implicit combinations of safe motions (e.g., "kicking" is "loading and swinging a leg"). We propose the first motion unlearning benchmark by filtering toxic motions from two large, recent text-to-motion datasets, HumanML3D and Motion-X. We propose baselines by adapting state-of-the-art image unlearning techniques to process spatio-temporal signals. Finally, we propose a novel motion unlearning model based on Latent Code Replacement, which we dub LCR. LCR is training-free and suited to the discrete latent spaces of state-of-the-art text-to-motion diffusion models. LCR is simple and consistently outperforms the baselines both qualitatively and quantitatively. Project page: https://www.pinlab.org/hmu
