InternLM-Math: Open Math Large Language Models Toward Verifiable Reasoning

February 9, 2024
Authors: Huaiyuan Ying, Shuo Zhang, Linyang Li, Zhejian Zhou, Yunfan Shao, Zhaoye Fei, Yichuan Ma, Jiawei Hong, Kuikun Liu, Ziyi Wang, Yudong Wang, Zijian Wu, Shuaibin Li, Fengzhe Zhou, Hongwei Liu, Songyang Zhang, Wenwei Zhang, Hang Yan, Xipeng Qiu, Jiayu Wang, Kai Chen, Dahua Lin
cs.AI

Abstract

The mathematical abilities of large language models can reflect their abstract reasoning ability. In this paper, we introduce and open-source InternLM-Math, our math reasoning LLMs, which are continually pre-trained from InternLM2. We unify chain-of-thought reasoning, reward modeling, formal reasoning, data augmentation, and code interpreter use in a single seq2seq format and supervise our model to be a versatile math reasoner, verifier, prover, and augmenter. These abilities can be used to develop the next generation of math LLMs or for self-iteration. InternLM-Math achieves state-of-the-art performance among open-source models under in-context learning, supervised fine-tuning, and code-assisted reasoning settings on various informal and formal benchmarks, including GSM8K, MATH, the Hungarian math exam, MathBench-ZH, and MiniF2F. Our pre-trained model achieves 30.3 on the MiniF2F test set without fine-tuning. We further explore how to use LEAN to solve math problems and study its performance under multi-task learning, which shows the possibility of using LEAN as a unified platform for both solving and proving in math. Our models, code, and data are released at https://github.com/InternLM/InternLM-Math.
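
To make the distinction between "solving" and "proving" in LEAN concrete, the following is a minimal Lean 4 sketch. It is an illustration only: the statements and the name `add_comm_example` are hypothetical and are not taken from the paper, its data, or MiniF2F. The first part treats LEAN as a checker for a computed answer; the second states and proves a general claim.

```lean
-- Hypothetical illustration; not drawn from the paper or from MiniF2F.

-- "Solving": evaluate an arithmetic expression to obtain an answer.
#eval 2 + 3 * 4

-- The computed answer can then be stated as a claim that Lean checks by evaluation.
example : 2 + 3 * 4 = 14 := rfl

-- "Proving": a general statement over all natural numbers,
-- discharged here with the core library lemma `Nat.add_comm`.
theorem add_comm_example (a b : Nat) : a + b = b + a :=
  Nat.add_comm a b
```

In both cases the output is a piece of text that the Lean checker accepts or rejects, which is one way to read the abstract's suggestion that a single platform could serve both tasks.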