Modeling Complex Mathematical Reasoning via Large Language Model based MathAgent
December 14, 2023
Authors: Haoran Liao, Qinyi Du, Shaohua Hu, Hao He, Yanyan Xu, Jidong Tian, Yaohui Jin
cs.AI
Abstract
Large language models (LLMs) face challenges in solving complex mathematical
problems that require comprehensive capacities to parse the statements,
associate domain knowledge, perform compound logical reasoning, and integrate
the intermediate rationales. Tackling all of these at once can be arduous
for LLMs and can lead to confusion during generation. In this work, we explore
the potential of enhancing LLMs with agents through meticulous decomposition
and modeling of the mathematical reasoning process. Specifically, we propose a
formal description of mathematical problem solving and extend LLMs with an
agent-based zero-shot framework named Planner-Reasoner-Executor-Reflector
(PRER). We
further provide and implement two MathAgents that define the logical forms and
inherent relations via a pool of actions at different granularities and
orientations: MathAgent-M adapts its actions to LLMs, while MathAgent-H aligns
with humans. Experiments on miniF2F and MATH demonstrate the effectiveness
of PRER and the proposed MathAgents, achieving an increase of
12.3% (53.9% → 66.2%) on miniF2F, 9.2% (49.8% → 59.0%) on MATH, and
13.2% (23.2% → 35.4%) on level-5 problems of MATH over
GPT-4. Further analytical results provide more insightful perspectives on
exploiting the behaviors of LLMs as agents.
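
To make the four PRER roles concrete, the following is a minimal sketch of how a Planner-Reasoner-Executor-Reflector loop might be wired around a single LLM callable. The abstract does not specify prompts, action pools, or stopping rules, so everything here is an assumption for illustration: the names `State`, `run_prer`, `scripted_llm`, and `append_result`, the `PLAN`/`REASON`/`REFLECT` prompt prefixes, and the `finish: <answer>` convention are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional

# Hypothetical LLM interface: prompt string in, completion string out.
LLM = Callable[[str], str]


@dataclass
class State:
    """Working memory shared by the four roles (illustrative only)."""
    problem: str
    steps: List[str] = field(default_factory=list)  # accepted intermediate results
    answer: Optional[str] = None


def run_prer(problem: str, llm: LLM, execute: Callable[[str], str],
             max_turns: int = 8) -> State:
    """One possible Planner -> Reasoner -> Executor -> Reflector loop."""
    state = State(problem)
    for _ in range(max_turns):
        context = "\n".join([state.problem, *state.steps])

        # Planner: pick the next action, or finish with an answer.
        action = llm(f"PLAN\n{context}").strip()
        if action.startswith("finish"):
            state.answer = action.partition(":")[2].strip()
            break

        # Reasoner: let the LLM draft the step for the chosen action.
        draft = llm(f"REASON\n{context}\naction: {action}").strip()

        # Executor: hand the draft to a deterministic tool (assumed interface).
        result = execute(draft)

        # Reflector: keep the result only if the LLM accepts it.
        verdict = llm(f"REFLECT\n{context}\ncandidate: {result}").strip().lower()
        if verdict == "accept":
            state.steps.append(result)
    return state


def scripted_llm(prompt: str) -> str:
    """Toy stand-in for a real model, scripted for the problem 'What is 2 + 3?'."""
    if prompt.startswith("PLAN"):
        return "finish: 5" if "2 + 3 = 5" in prompt else "compute"
    if prompt.startswith("REASON"):
        return "2 + 3"
    return "accept"


def append_result(draft: str) -> str:
    """Toy executor: pretends a calculator evaluated the draft expression."""
    return f"{draft} = 5"
```

Calling `run_prer("What is 2 + 3?", scripted_llm, append_result)` walks through one compute turn and then a finishing turn. Keeping the executor as a separate callable is a design choice of this sketch, so that deterministic computation stays outside the model; the paper's actual division of labor may differ.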