

Learn Beyond The Answer: Training Language Models with Reflection for Mathematical Reasoning

June 17, 2024
作者: Zhihan Zhang, Zhenwen Liang, Wenhao Yu, Dian Yu, Mengzhao Jia, Dong Yu, Meng Jiang
cs.AI

Abstract

Supervised fine-tuning enhances the problem-solving abilities of language models across various mathematical reasoning tasks. To maximize such benefits, existing research focuses on broadening the training set with various data augmentation techniques, which is effective for standard single-round question-answering settings. Our work introduces a novel technique aimed at cultivating a deeper understanding of the training problems at hand, enhancing performance not only in standard settings but also in more complex scenarios that require reflective thinking. Specifically, we propose reflective augmentation, a method that embeds problem reflection into each training instance. It trains the model to consider alternative perspectives and engage with abstractions and analogies, thereby fostering a thorough comprehension through reflective reasoning. Extensive experiments validate the achievement of our aim, underscoring the unique advantages of our method and its complementary nature relative to existing augmentation techniques.
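The abstract describes reflective augmentation as embedding a problem reflection (alternative perspectives, abstractions, analogies) into each training instance. A minimal sketch of what such an augmented instance might look like is below; the field names, prompt layout, and the idea of appending the reflection after the original solution are illustrative assumptions, not the paper's exact format.

```python
def build_reflective_instance(question, answer, abstraction, analogy):
    """Extend a (question, answer) training pair with a reflection section.

    The reflection here holds an abstraction of the underlying principle and
    an analogous problem, mirroring the components named in the abstract.
    This layout is a hypothetical sketch, not the authors' released format.
    """
    target = (
        f"{answer}\n\n"
        "Reflection:\n"
        f"Abstraction: {abstraction}\n"
        f"Analogy: {analogy}"
    )
    return {"input": question, "target": target}


# Toy example of constructing one augmented instance for fine-tuning.
instance = build_reflective_instance(
    question="Tom has 3 boxes with 4 apples each. How many apples in total?",
    answer="3 * 4 = 12, so Tom has 12 apples.",
    abstraction="The total of n equal groups of size k is n * k.",
    analogy="Similarly, 5 shelves with 6 books each hold 5 * 6 = 30 books.",
)
```

In a supervised fine-tuning setup, the model would then be trained to generate the full `target` (solution plus reflection) given the `input`, so the reflective content shapes the loss alongside the answer itself.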

