
SciEducator: Scientific Video Understanding and Educating via Deming-Cycle Multi-Agent System

November 22, 2025
Authors: Zhiyu Xu, Weilong Yan, Yufei Shi, Xin Meng, Tao He, Huiping Zhuang, Ming Li, Hehe Fan
cs.AI

Abstract

Recent advancements in multimodal large language models (MLLMs) and video agent systems have significantly improved general video understanding. However, when applied to scientific video understanding and educating, a domain that demands external professional knowledge integration and rigorous step-wise reasoning, existing approaches often struggle. To bridge this gap, we propose SciEducator, the first iterative self-evolving multi-agent system for scientific video comprehension and education. Rooted in the classical Deming Cycle from management science, our design reformulates its Plan-Do-Study-Act philosophy into a self-evolving reasoning and feedback mechanism, which facilitates the interpretation of intricate scientific activities in videos. Moreover, SciEducator can produce multimodal educational content tailored to specific scientific processes, including textual instructions, visual guides, audio narrations, and interactive references. To support evaluation, we construct SciVBench, a benchmark consisting of 500 expert-verified and literature-grounded science QA pairs across five categories, covering physical, chemical, and everyday phenomena. Extensive experiments demonstrate that SciEducator substantially outperforms leading closed-source MLLMs (e.g., Gemini, GPT-4o) and state-of-the-art video agents on the benchmark, establishing a new paradigm for the community.
PDF · December 1, 2025