MAPS: A Multi-Agent Framework Based on Big Seven Personality and Socratic Guidance for Multimodal Scientific Problem Solving
March 21, 2025
Authors: Jian Zhang, Zhiyuan Wang, Zhangqi Wang, Xinyu Zhang, Fangzhi Xu, Qika Lin, Rui Mao, Erik Cambria, Jun Liu
cs.AI
Abstract
Multimodal scientific problems (MSPs) involve complex issues that require the
integration of multiple modalities, such as text and diagrams, presenting a
significant challenge in artificial intelligence. While progress has been made
in addressing traditional scientific problems, MSPs still face two primary
issues: the challenge of multi-modal comprehensive reasoning in scientific
problem-solving and the lack of reflective and rethinking capabilities. To
address these issues, we introduce a Multi-Agent framework based on the Big
Seven Personality and Socratic guidance (MAPS). This framework employs seven
distinct agents that leverage feedback mechanisms and the Socratic method to
guide the resolution of MSPs. To tackle the first issue, we propose a
progressive four-agent solving strategy, where each agent focuses on a specific
stage of the problem-solving process. For the second issue, we introduce a
Critic agent, inspired by Socratic questioning, which prompts critical thinking
and stimulates autonomous learning. We conduct extensive experiments on the
EMMA, Olympiad, and MathVista datasets, achieving promising results that
outperform the current SOTA model by 15.84% across all tasks. Additional
analytical experiments further verify the model's progress and its
generalization ability.
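The control flow described in the abstract, a progressive four-stage solving pass followed by a Critic that asks a question and triggers rethinking, can be sketched roughly as follows. This is only an illustrative sketch, not the authors' implementation: the abstract does not specify the agents' roles, prompts, or how feedback is routed, so the names `StageAgent`, `CriticAgent`, `solve`, and `max_rounds`, as well as the choice to rerun from the questioned stage onward, are all assumptions. In practice each agent would presumably wrap a multimodal model call; here agents are plain functions on strings.

```python
from typing import Callable, List, Optional, Tuple

# Hypothetical interfaces; the abstract does not define them.
# A stage agent maps the running solution state (plus an optional
# critic question) to a new state.
StageAgent = Callable[[str, Optional[str]], str]
# A critic inspects all stage outputs and returns
# (index of the stage to revisit, question), or None to accept.
CriticAgent = Callable[[List[str]], Optional[Tuple[int, str]]]


def solve(
    problem: str,
    stages: List[StageAgent],
    critic: CriticAgent,
    max_rounds: int = 3,
) -> str:
    """Run the stages in order, then let the critic request revisions.

    Assumes `stages` is non-empty. `max_rounds` bounds the number of
    critic-triggered reruns (an assumed safeguard against endless loops).
    """
    outputs: List[str] = []
    state = problem
    # Progressive pass: each stage builds on the previous stage's output.
    for stage in stages:
        state = stage(state, None)
        outputs.append(state)

    for _ in range(max_rounds):
        feedback = critic(outputs)
        if feedback is None:
            break  # critic accepts the current solution
        index, question = feedback
        # Rewind to the input of the questioned stage and rerun from there.
        state = problem if index == 0 else outputs[index - 1]
        outputs = outputs[:index]
        for position in range(index, len(stages)):
            # Only the questioned stage sees the critic's question.
            hint = question if position == index else None
            state = stages[position](state, hint)
            outputs.append(state)
    return outputs[-1]
```

With four toy stages that each append their name to the state, and a critic that questions the third stage once, the final answer reflects a single rerun of the third and fourth stages.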