
EmoVid: A Multimodal Emotion Video Dataset for Emotion-Centric Video Understanding and Generation

November 14, 2025
Authors: Zongyang Qiu, Bingyuan Wang, Xingbei Chen, Yingqing He, Zeyu Wang
cs.AI

Abstract

Emotion plays a pivotal role in video-based expression, but existing video generation systems predominantly focus on low-level visual metrics while neglecting affective dimensions. Although emotion analysis has made progress in the visual domain, the video community lacks dedicated resources to bridge emotion understanding with generative tasks, particularly for stylized and non-realistic contexts. To address this gap, we introduce EmoVid, the first multimodal, emotion-annotated video dataset specifically designed for creative media, covering cartoon animations, movie clips, and animated stickers. Each video is annotated with emotion labels, visual attributes (brightness, colorfulness, hue), and text captions. Through systematic analysis, we uncover spatial and temporal patterns linking visual features to perceived emotion across diverse video forms. Building on these insights, we develop an emotion-conditioned video generation technique by fine-tuning the Wan2.1 model. Experiments show significant improvements in both quantitative metrics and the visual quality of generated videos on text-to-video and image-to-video tasks. EmoVid establishes a new benchmark for affective video computing. Our work not only offers valuable insights into visual emotion analysis in artistically styled videos, but also provides practical methods for enhancing emotional expression in video generation.
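The abstract lists per-video visual attributes (brightness, colorfulness, hue) without specifying how they are computed. As an illustrative sketch only, the code below shows one plausible per-frame measurement using OpenCV: mean HSV value for brightness, the Hasler–Süsstrunk (2003) opponent-channel metric for colorfulness, and a circular mean for hue. These choices are assumptions, not EmoVid's documented pipeline.

```python
import cv2
import numpy as np

def visual_attributes(frame_bgr: np.ndarray) -> dict:
    """Per-frame brightness, colorfulness, and hue (hypothetical measures)."""
    # Brightness: mean of the HSV value channel, normalized to [0, 1].
    hsv = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2HSV)
    h, _, v = cv2.split(hsv)
    brightness = float(v.mean()) / 255.0

    # Colorfulness: Hasler & Suesstrunk (2003) opponent-channel metric.
    b, g, r = cv2.split(frame_bgr.astype(np.float64))
    rg = r - g
    yb = 0.5 * (r + g) - b
    colorfulness = float(np.hypot(rg.std(), yb.std())
                         + 0.3 * np.hypot(rg.mean(), yb.mean()))

    # Hue: circular mean, since hue wraps around (OpenCV stores H in [0, 180)).
    angles = h.astype(np.float64) * (np.pi / 90.0)    # map [0, 180) -> [0, 2*pi)
    mean_angle = np.arctan2(np.sin(angles).mean(), np.cos(angles).mean())
    hue_deg = (np.degrees(mean_angle) % 360.0) / 2.0  # back to OpenCV's scale

    return {"brightness": brightness,
            "colorfulness": colorfulness,
            "hue": hue_deg}
```

Video-level attributes could then be obtained by averaging frame values, though the dataset's actual aggregation scheme is not stated here. Likewise, the abstract does not describe how emotion conditioning is injected when fine-tuning Wan2.1; a common, purely hypothetical approach is to fold the emotion label into the text prompt:

```python
def emotion_prompt(caption: str, emotion: str) -> str:
    # Hypothetical conditioning: prepend the emotion label to the caption
    # before feeding it to the fine-tuned text-to-video model.
    return f"[emotion: {emotion}] {caption}"

# Example: emotion_prompt("a fox runs through snowy woods", "joy")
# -> "[emotion: joy] a fox runs through snowy woods"
```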