EgoEdit: Dataset, Real-Time Streaming Model, and Benchmark for Egocentric Video Editing

December 5, 2025
Authors: Runjia Li, Moayed Haji-Ali, Ashkan Mirzaei, Chaoyang Wang, Arpit Sahni, Ivan Skorokhodov, Aliaksandr Siarohin, Tomas Jakab, Junlin Han, Sergey Tulyakov, Philip Torr, Willi Menapace
cs.AI

Abstract

We study instruction-guided editing of egocentric videos for interactive AR applications. While recent AI video editors perform well on third-person footage, egocentric views present unique challenges, including rapid egomotion and frequent hand-object interactions, that create a significant domain gap. Moreover, existing offline editing pipelines suffer from high latency, limiting real-time interaction. To address these issues, we present a complete ecosystem for egocentric video editing. First, we construct EgoEditData, a carefully designed and manually curated dataset built specifically for egocentric editing scenarios, featuring rich hand-object interactions while explicitly preserving hands. Second, we develop EgoEdit, an instruction-following egocentric video editor that supports real-time streaming inference on a single GPU. Finally, we introduce EgoEditBench, an evaluation suite targeting instruction faithfulness, hand and interaction preservation, and temporal stability under egomotion. Across both egocentric and general editing tasks, EgoEdit produces temporally stable, instruction-faithful results with interactive latency. It achieves clear gains on egocentric editing benchmarks, where existing methods struggle, while maintaining performance comparable to the strongest baselines on general editing tasks. EgoEditData and EgoEditBench will be made public for the research community. See our website at https://snap-research.github.io/EgoEdit.
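
To make the temporal-stability criterion concrete, the sketch below shows one generic way a stability score could be computed over an edited clip. This is purely an illustrative assumption, not the metric defined by EgoEditBench: the function name temporal_consistency and the raw-pixel cosine-similarity measure are hypothetical choices, and published evaluations more often rely on optical-flow warping error or learned feature similarity.

    # Illustrative sketch only: a generic frame-to-frame consistency score,
    # NOT the metric defined by EgoEditBench. Frames are assumed to be a list
    # of HxWx3 uint8 numpy arrays decoded from the edited clip.
    import numpy as np

    def temporal_consistency(frames):
        # Mean cosine similarity between consecutive frames; flicker or
        # unstable edits under egomotion lower the score.
        if len(frames) < 2:
            return 1.0
        sims = []
        for prev, curr in zip(frames[:-1], frames[1:]):
            a = prev.astype(np.float32).ravel()
            b = curr.astype(np.float32).ravel()
            denom = np.linalg.norm(a) * np.linalg.norm(b)
            sims.append(float(a @ b / denom) if denom > 0 else 0.0)
        return float(np.mean(sims))

A stability score of this kind would typically be reported alongside instruction-faithfulness and hand-preservation measures when comparing editors.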