Story-to-Motion: Synthesizing Infinite and Controllable Character Animation from Long Text

Abstract

Generating natural human motion from a story has the potential to transform the animation, gaming, and film industries. A new and challenging task, Story-to-Motion, arises when characters must move to various locations and perform specific motions based on a long text description. This task demands a fusion of low-level control (trajectories) and high-level control (motion semantics). Previous work in character control and text-to-motion has addressed related aspects, yet a comprehensive solution remains elusive: character control methods do not handle text descriptions, whereas text-to-motion methods lack position constraints and often produce unstable motions. In light of these limitations, we propose a novel system that generates controllable, infinitely long motions and trajectories aligned with the input text. (1) We leverage contemporary Large Language Models as a text-driven motion scheduler that extracts a series of (text, position, duration) triples from long text. (2) We develop a text-driven motion retrieval scheme that combines motion matching with motion semantics and trajectory constraints. (3) We design a progressive mask transformer that addresses common artifacts in transition motion, such as unnatural poses and foot sliding. Beyond being the first comprehensive solution for Story-to-Motion, our system is evaluated on three distinct sub-tasks: trajectory following, temporal action composition, and motion blending, where it outperforms previous state-of-the-art motion synthesis methods across the board. Homepage: https://story2motion.github.io/.
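
To make steps (1) and (2) more concrete, the following is a minimal illustrative sketch, not the authors' implementation. It assumes that a scheduled step can be held in a small record and that retrieval can be approximated by minimizing a weighted sum of a semantic cost and a trajectory cost. The names `ScheduledStep`, `Clip`, `retrieve`, and `traj_weight`, the 2D positions, and the toy embeddings are all hypothetical; the actual system may represent and score motions quite differently.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple
import math


@dataclass
class ScheduledStep:
    """One (text, position, duration) item, mirroring the scheduler output in the abstract."""
    text: str                       # short action description, e.g. "walk"
    position: Tuple[float, float]   # hypothetical 2D target location
    duration: float                 # hypothetical unit: seconds


@dataclass
class Clip:
    """Hypothetical motion-database entry."""
    name: str
    embedding: Tuple[float, ...]        # stand-in for a learned semantic embedding
    displacement: Tuple[float, float]   # net root translation of the clip


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity, returning 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve(query_embedding, wanted_displacement, clips, traj_weight=0.5):
    """Return the clip with the lowest combined semantic + trajectory cost."""
    def cost(clip: Clip) -> float:
        semantic_cost = 1.0 - cosine(query_embedding, clip.embedding)
        trajectory_cost = math.dist(wanted_displacement, clip.displacement)
        return semantic_cost + traj_weight * trajectory_cost
    return min(clips, key=cost)


clips = [
    Clip("walk_forward", (1.0, 0.0), (2.0, 0.0)),
    Clip("walk_left", (1.0, 0.0), (0.0, 2.0)),
    Clip("sit_down", (0.0, 1.0), (0.0, 0.0)),
]

# The character is assumed to start at the origin, so the wanted
# displacement equals the step's target position.
step = ScheduledStep(text="walk", position=(2.0, 0.0), duration=3.0)
best = retrieve((1.0, 0.0), step.position, clips)
print(best.name)  # walk_forward
```

In this toy setup, both walking clips match the query semantically, and the trajectory term breaks the tie in favor of the clip whose displacement reaches the target. The transition refinement of step (3) is not modeled here.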
