An Empirical Study on Challenging Math Problem Solving with GPT-4
June 2, 2023
Authors: Yiran Wu, Feiran Jia, Shaokun Zhang, Qingyun Wu, Hangyu Li, Erkang Zhu, Yue Wang, Yin Tat Lee, Richard Peng, Chi Wang
cs.AI
Abstract
Employing Large Language Models (LLMs) to address mathematical problems is an
intriguing research endeavor, considering the abundance of math problems
expressed in natural language across numerous science and engineering fields.
While several prior works have investigated solving elementary mathematics
using LLMs, this work explores the frontier of using GPT-4 for solving more
complex and challenging math problems. We evaluate various ways of using GPT-4.
Some of them are adapted from existing work, and one is MathChat, a
conversational problem-solving framework newly proposed in this work. Our
evaluation on difficult high school competition problems from the MATH
dataset shows the advantage of the proposed conversational approach.