
MathCoder: Seamless Code Integration in LLMs for Enhanced Mathematical Reasoning

October 5, 2023
作者: Ke Wang, Houxing Ren, Aojun Zhou, Zimu Lu, Sichun Luo, Weikang Shi, Renrui Zhang, Linqi Song, Mingjie Zhan, Hongsheng Li
cs.AI

Abstract

The recently released GPT-4 Code Interpreter has demonstrated remarkable proficiency in solving challenging math problems, primarily attributed to its ability to seamlessly reason with natural language, generate code, execute code, and continue reasoning based on the execution output. In this paper, we present a method to fine-tune open-source language models, enabling them to use code for modeling and deriving math equations and, consequently, enhancing their mathematical reasoning abilities. We propose a method of generating novel and high-quality datasets with math problems and their code-based solutions, referred to as MathCodeInstruct. Each solution interleaves natural language, code, and execution results. We also introduce a customized supervised fine-tuning and inference approach. This approach yields the MathCoder models, a family of models capable of generating code-based solutions for solving challenging math problems. Impressively, the MathCoder models achieve state-of-the-art scores among open-source LLMs on the MATH (45.2%) and GSM8K (83.9%) datasets, substantially outperforming other open-source alternatives. Notably, the MathCoder model not only surpasses ChatGPT-3.5 and PaLM-2 on GSM8K and MATH but also outperforms GPT-4 on the competition-level MATH dataset. The dataset and models will be released at https://github.com/mathllm/MathCoder.
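The interleaved solution format described above can be sketched as follows. This is a hypothetical illustration of the style of a MathCodeInstruct solution, not an example from the paper's dataset: natural language reasoning appears as comments, followed by code, followed by the captured execution output that the model continues reasoning from.

```python
# Hypothetical example of an interleaved solution (reasoning as comments,
# code, then execution output). The problem is invented for this sketch.

# Problem: find x such that 3x + 7 = 52, then report x^2.

# Reasoning: isolate x by subtracting 7 and dividing by 3, then square it.
x = (52 - 7) / 3         # 3x + 7 = 52  =>  x = 45 / 3 = 15
result = x ** 2          # the quantity the problem asks for

print(int(result))       # execution output fed back into the reasoning: 225
```

During inference, a model fine-tuned on this format generates the code block, an external interpreter executes it, and the printed result is appended to the context so generation can continue conditioned on the verified value.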