LongAnimation: Long Animation Generation with Dynamic Global-Local Memory
July 2, 2025
Authors: Nan Chen, Mengqi Huang, Yihao Meng, Zhendong Mao
cs.AI
Abstract
Animation colorization is a crucial part of real-world animation
production. Long animation colorization has high labor costs, so
automated long animation colorization based on video generation models has
significant research value. Existing studies are limited to short-term
colorization. These studies adopt a local paradigm, fusing overlapping features
to achieve smooth transitions between local segments. However, the local
paradigm neglects global information, failing to maintain long-term color
consistency. In this study, we argue that ideal long-term color consistency can
be achieved through a dynamic global-local paradigm, i.e., dynamically
extracting global color-consistent features relevant to the current generation.
Specifically, we propose LongAnimation, a novel framework, which mainly
includes a SketchDiT, a Dynamic Global-Local Memory (DGLM), and a Color
Consistency Reward. The SketchDiT captures hybrid reference features to support
the DGLM module. The DGLM module employs a long video understanding model to
dynamically compress global historical features and adaptively fuse them with
the current generation features. To refine the color consistency, we introduce
a Color Consistency Reward. During inference, we propose a color consistency
fusion to smooth transitions between video segments. Extensive experiments on
both short-term (14 frames) and long-term (500 frames on average) animations
show that LongAnimation effectively maintains short-term and long-term color
consistency in the open-domain animation colorization task. The code can be
found at https://cn-makers.github.io/long_animation_web/.
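
As an illustration of the local paradigm that the abstract contrasts against, the following minimal sketch joins two independently generated segments by linearly cross-fading their overlapping frames. It is not the LongAnimation implementation: the function name `crossfade_segments`, the `Frame` alias, and the linear weighting are assumptions chosen for illustration, and each frame is reduced to a flat list of floats.

```python
from typing import List

# Hypothetical stand-in for a video frame or latent feature vector.
Frame = List[float]


def crossfade_segments(seg_a: List[Frame], seg_b: List[Frame], overlap: int) -> List[Frame]:
    """Join two segments whose last/first `overlap` frames cover the same time steps.

    The overlapping frames are blended with a linear ramp (an assumed weighting),
    so the output moves gradually from seg_a to seg_b.
    """
    if not 0 < overlap <= min(len(seg_a), len(seg_b)):
        raise ValueError("overlap must be between 1 and the shorter segment length")

    blended: List[Frame] = []
    for i in range(overlap):
        # Weight on seg_b rises linearly across the overlap window.
        w = (i + 1) / (overlap + 1)
        frame_a = seg_a[len(seg_a) - overlap + i]
        frame_b = seg_b[i]
        blended.append([(1 - w) * a + w * b for a, b in zip(frame_a, frame_b)])

    # Non-overlapping head of seg_a, blended window, non-overlapping tail of seg_b.
    return seg_a[:-overlap] + blended + seg_b[overlap:]


if __name__ == "__main__":
    a = [[0.0] for _ in range(4)]  # segment colored "dark"
    b = [[1.0] for _ in range(4)]  # segment colored "bright"
    print(crossfade_segments(a, b, overlap=3))
```

Because the blend only sees the frames inside the overlap window, it can smooth a boundary but carries no information from earlier segments, which is the limitation the abstract attributes to the local paradigm.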