Learn Beyond The Answer: Training Language Models with Reflection for Mathematical Reasoning

Abstract

Supervised fine-tuning enhances the problem-solving abilities of language models across various mathematical reasoning tasks. To maximize such benefits, existing research focuses on broadening the training set with various data augmentation techniques, which is effective for standard single-round question-answering settings. Our work introduces a novel technique aimed at cultivating a deeper understanding of the training problems at hand, enhancing performance not only in standard settings but also in more complex scenarios that require reflective thinking. Specifically, we propose reflective augmentation, a method that embeds problem reflection into each training instance. It trains the model to consider alternative perspectives and engage with abstractions and analogies, thereby fostering a thorough comprehension through reflective reasoning. Extensive experiments validate the achievement of our aim, underscoring the unique advantages of our method and its complementary nature relative to existing augmentation techniques.
