MathCoder: Seamless Code Integration in LLMs for Enhanced Mathematical Reasoning
October 5, 2023
Authors: Ke Wang, Houxing Ren, Aojun Zhou, Zimu Lu, Sichun Luo, Weikang Shi, Renrui Zhang, Linqi Song, Mingjie Zhan, Hongsheng Li
cs.AI
Abstract
The recently released GPT-4 Code Interpreter has demonstrated remarkable
proficiency in solving challenging math problems, primarily attributed to its
ability to seamlessly reason with natural language, generate code, execute
code, and continue reasoning based on the execution output. In this paper, we
present a method to fine-tune open-source language models, enabling them to use
code for modeling and deriving math equations and, consequently, enhancing
their mathematical reasoning abilities. We propose a method of generating novel
and high-quality datasets with math problems and their code-based solutions,
referred to as MathCodeInstruct. Each solution interleaves natural language,
code, and execution results. We also introduce a customized supervised
fine-tuning and inference approach. This approach yields the MathCoder models,
a family of models capable of generating code-based solutions for solving
challenging math problems. Impressively, the MathCoder models achieve
state-of-the-art scores among open-source LLMs on the MATH (45.2%) and GSM8K
(83.9%) datasets, substantially outperforming other open-source alternatives.
Notably, the MathCoder model not only surpasses ChatGPT-3.5 and PaLM-2 on GSM8K
and MATH but also outperforms GPT-4 on the competition-level MATH dataset. The
dataset and models will be released at https://github.com/mathllm/MathCoder.
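
To make the interleaved solution format more concrete, the following is a minimal sketch of how an inference loop could alternate between model output and code execution. It is an illustration under stated assumptions, not the MathCoder implementation: the `Block` type, the `solve` and `run_code` helpers, and the `scripted_model` stand-in for a language model are all hypothetical names, and the abstract does not specify how blocks are delimited or how generation is halted.

```python
import contextlib
import io
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Block:
    """One segment of an interleaved solution (hypothetical structure)."""
    kind: str      # "text", "code", or "execution"
    content: str


def run_code(source: str, namespace: dict) -> str:
    """Execute a code block and return what it printed, or the error message."""
    buffer = io.StringIO()
    try:
        with contextlib.redirect_stdout(buffer):
            exec(source, namespace)  # unsandboxed; acceptable only for a toy sketch
    except Exception as exc:
        return f"{type(exc).__name__}: {exc}"
    return buffer.getvalue().strip()


def solve(
    question: str,
    generate: Callable[[List[Block]], Optional[Block]],
    max_steps: int = 8,
) -> List[Block]:
    """Alternate between model generation and code execution.

    `generate` stands in for the fine-tuned model: given the history so far,
    it returns the next text or code block, or None when the solution is done.
    """
    history = [Block("text", question)]
    namespace: dict = {}  # shared, so later code blocks can reuse earlier variables
    for _ in range(max_steps):
        block = generate(history)
        if block is None:
            break
        history.append(block)
        if block.kind == "code":
            # Feed the execution result back so the model can keep reasoning on it.
            history.append(Block("execution", run_code(block.content, namespace)))
    return history


def scripted_model(script: List[Block]) -> Callable[[List[Block]], Optional[Block]]:
    """Hypothetical stand-in for an LLM: replays a fixed list of blocks."""
    remaining = iter(script)
    return lambda history: next(remaining, None)


solution = solve(
    "What is the sum of the first 100 positive integers?",
    scripted_model([
        Block("text", "I will compute the sum directly with code."),
        Block("code", "print(sum(range(1, 101)))"),
        Block("text", "The execution result shows that the answer is 5050."),
    ]),
)

for block in solution:
    print(f"[{block.kind}] {block.content}")
```

In this sketch the execution result is appended to the history before the next generation step, which mirrors the idea of continuing to reason based on the execution output. A real system would replace `scripted_model` with an actual model call and run the generated code in a sandbox.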