OmniLottie: Generating Vector Animations via Parameterized Lottie Tokens

Abstract

OmniLottie is a versatile framework that generates high-quality vector animations from multi-modal instructions. For flexible control over motion and visual content, we focus on Lottie, a lightweight JSON format for representing both shapes and animation behaviors. However, raw Lottie JSON files contain extensive invariant structural metadata and formatting tokens, which pose significant challenges for learning vector animation generation. We therefore introduce a dedicated Lottie tokenizer that transforms JSON files into structured sequences of commands and parameters representing shapes, animation functions, and control parameters. This tokenizer enables us to build OmniLottie on top of pretrained vision-language models so that it follows interleaved multi-modal instructions and generates high-quality vector animations. To further advance research in vector animation generation, we curate MMLottie-2M, a large-scale dataset of professionally designed vector animations paired with textual and visual annotations. Through extensive experiments, we validate that OmniLottie produces vivid and semantically aligned vector animations that adhere closely to multi-modal human instructions.
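As a rough illustration of what turning Lottie JSON into a sequence of commands and parameters could look like, the sketch below flattens a tiny hand-written Lottie document. The command names (`CANVAS`, `LAYER`, `SET`, `KEYFRAME`, `RECT`, and so on) and the helper functions are hypothetical and are not taken from the OmniLottie tokenizer, whose vocabulary the abstract does not specify. Only the field names (`w`, `h`, `fr`, `ip`, `op`, `layers`, `ks`, `shapes`, `ty`, `a`, `k`, `t`, `s`) follow the Lottie format.

```python
# Hypothetical command vocabulary for illustration only; the abstract does not
# specify the actual commands used by the OmniLottie tokenizer.
SHAPE_COMMANDS = {"rc": "RECT", "el": "ELLIPSE", "fl": "FILL"}


def is_property(obj):
    """A Lottie property is an object carrying its value (or keyframes) in "k"."""
    return isinstance(obj, dict) and "k" in obj


def flatten_property(name, prop):
    """Yield (command, params) pairs for one Lottie property."""
    if prop.get("a", 0) == 1:
        # Animated property: "k" is a list of keyframes with time "t" and value "s".
        for keyframe in prop["k"]:
            yield ("KEYFRAME", [name, keyframe["t"], *keyframe.get("s", [])])
    else:
        # Static property: "k" holds the value directly (scalar or list).
        value = prop["k"]
        yield ("SET", [name, *(value if isinstance(value, list) else [value])])


def tokenize(lottie):
    """Flatten a Lottie dict into a list of (command, params) pairs."""
    seq = [("CANVAS", [lottie[key] for key in ("w", "h", "fr", "ip", "op")])]
    for layer in lottie.get("layers", []):
        seq.append(("LAYER", [layer["ty"]]))
        for name, prop in layer.get("ks", {}).items():
            if is_property(prop):
                seq.extend(flatten_property(name, prop))
        for shape in layer.get("shapes", []):
            command = SHAPE_COMMANDS.get(shape["ty"])
            if command is None:
                continue  # shape types outside this toy vocabulary are skipped
            seq.append((command, []))
            for name, prop in shape.items():
                if is_property(prop):
                    seq.extend(flatten_property(name, prop))
    return seq


# A minimal Lottie document: one shape layer whose position is animated
# from x=10 to x=90 over 60 frames, containing a single rectangle.
example = {
    "v": "5.7.0", "fr": 30, "ip": 0, "op": 60, "w": 100, "h": 100,
    "layers": [{
        "ty": 4, "nm": "box",
        "ks": {
            "o": {"a": 0, "k": 100},
            "p": {"a": 1, "k": [{"t": 0, "s": [10, 50]}, {"t": 60, "s": [90, 50]}]},
        },
        "shapes": [{
            "ty": "rc",
            "p": {"a": 0, "k": [0, 0]},
            "s": {"a": 0, "k": [20, 20]},
            "r": {"a": 0, "k": 0},
        }],
    }],
}

tokens = tokenize(example)
for command, params in tokens:
    print(command, params)
```

In this toy version, keys such as `v` and `nm` and all JSON punctuation disappear from the output, which is the kind of invariant structure the abstract says makes raw Lottie JSON hard to learn from. A real tokenizer would also need to cover easing curves, paths, nested groups, and a discretization of numeric parameters, none of which are attempted here.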
