Keyframer: Empowering Animation Design using Large Language Models

Abstract

Large language models (LLMs) have the potential to impact a wide range of creative domains, but their application to animation is underexplored and presents novel challenges, such as how users might effectively describe motion in natural language. In this paper, we present Keyframer, a design tool for animating static images (SVGs) with natural language. Informed by interviews with professional animation designers and engineers, Keyframer supports exploration and refinement of animations by combining prompting with direct editing of generated output. The system also enables users to request design variants, supporting comparison and ideation. Through a user study with 13 participants, we contribute a characterization of user prompting strategies, including a taxonomy of semantic prompt types for describing motion and a 'decomposed' prompting style in which users continually adapt their goals in response to generated output. We share how direct editing along with prompting enables iteration beyond the one-shot prompting interfaces common in generative tools today. Through this work, we propose how LLMs might empower a range of audiences to engage with animation creation.
