Modeling Complex Mathematical Reasoning via Large Language Model based MathAgent
December 14, 2023
Authors: Haoran Liao, Qinyi Du, Shaohua Hu, Hao He, Yanyan Xu, Jidong Tian, Yaohui Jin
cs.AI
Abstract
Large language models (LLMs) face challenges in solving complex mathematical
problems that require comprehensive capacities to parse the statements,
associate domain knowledge, perform compound logical reasoning, and integrate
the intermediate rationales. Tackling all these problems at once can be arduous
for LLMs, often leading to confusion during generation. In this work, we explore
the potential of enhancing LLMs with agents through meticulous decomposition and
modeling of the mathematical reasoning process. Specifically, we propose a formal
description of mathematical problem solving and extend LLMs with an agent-based
zero-shot framework named Planner-Reasoner-Executor-Reflector (PRER). We
further provide and implement two MathAgents that define the logical forms and
inherent relations via a pool of actions at different granularities and
orientations: MathAgent-M adapts its actions to LLMs, while MathAgent-H aligns
with human reasoning. Experiments on miniF2F and MATH demonstrate the
effectiveness of PRER and the proposed MathAgents, achieving gains over GPT-4 of
12.3% (53.9% → 66.2%) on miniF2F, 9.2% (49.8% → 59.0%) on MATH, and
13.2% (23.2% → 35.4%) on level-5 problems of MATH. Further analyses provide
deeper insight into exploiting the behaviors of LLMs as agents.
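The abstract only names the four PRER roles, so the sketch below shows one plausible way such a zero-shot Planner-Reasoner-Executor-Reflector loop could be wired together. Every name here (State, planner, reasoner, executor, reflector, the llm placeholder, and the retry logic) is an illustrative assumption, not the paper's actual implementation or prompts.

```python
# Minimal sketch of a PRER-style loop. All names and prompts are
# hypothetical; the paper's concrete action pool is not in the abstract.
from dataclasses import dataclass, field


@dataclass
class State:
    problem: str
    plan: list[str] = field(default_factory=list)
    rationales: list[str] = field(default_factory=list)
    answer: str | None = None


def llm(prompt: str) -> str:
    """Placeholder for a call to an underlying LLM (e.g., GPT-4)."""
    raise NotImplementedError


def planner(state: State) -> list[str]:
    # Decompose the problem statement into ordered sub-goals.
    out = llm(f"Decompose into sub-goals:\n{state.problem}")
    return [line for line in out.splitlines() if line.strip()]


def reasoner(state: State, subgoal: str) -> str:
    # Produce an intermediate rationale for one sub-goal,
    # conditioned on the rationales accumulated so far.
    context = "\n".join(state.rationales)
    return llm(f"Problem: {state.problem}\nSo far:\n{context}\nSolve: {subgoal}")


def executor(rationale: str) -> str:
    # Carry out the concrete step (e.g., a computation or proof step);
    # delegated to the LLM in this sketch.
    return llm(f"Execute and simplify:\n{rationale}")


def reflector(state: State, result: str) -> bool:
    # Check the step against the problem; accept or reject it.
    verdict = llm(f"Problem: {state.problem}\nStep result: {result}\nValid? yes/no")
    return verdict.strip().lower().startswith("yes")


def prer(problem: str, max_retries: int = 2) -> State:
    state = State(problem)
    state.plan = planner(state)
    for subgoal in state.plan:
        for _ in range(max_retries + 1):
            result = executor(reasoner(state, subgoal))
            if reflector(state, result):
                state.rationales.append(result)
                break
    state.answer = llm("Integrate into a final answer:\n" + "\n".join(state.rationales))
    return state
```

The retry loop around the reflector is one way to read the abstract's claim that decomposing the process avoids "confusion in generation": each sub-goal is verified in isolation before its rationale is integrated into the running context.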