Learning to Reason via Mixture-of-Thought for Logical Reasoning
May 21, 2025
作者: Tong Zheng, Lichang Chen, Simeng Han, R. Thomas McCoy, Heng Huang
cs.AI
Abstract
Human beings naturally utilize multiple reasoning modalities to learn and
solve logical problems, i.e., different representational formats such as
natural language, code, and symbolic logic. In contrast, most existing
LLM-based approaches operate with a single reasoning modality during training,
typically natural language. Although some methods have explored modality
selection or augmentation at inference time, the training process remains
modality-blind, limiting synergy among modalities. To fill this gap, we propose
Mixture-of-Thought (MoT), a framework that enables LLMs to reason across three
complementary modalities: natural language, code, and a newly introduced
symbolic modality, truth-table, which systematically enumerates logical cases
and partially mitigates key failure modes in natural language reasoning. MoT
adopts a two-phase design: (1) self-evolving MoT training, which jointly learns
from filtered, self-generated rationales across modalities; and (2) MoT
inference, which fully leverages the synergy of the three modalities to produce
better predictions. Experiments on logical reasoning benchmarks including FOLIO
and ProofWriter demonstrate that our MoT framework consistently and
significantly outperforms strong LLM baselines with single-modality
chain-of-thought approaches, achieving up to +11.7pp average accuracy gain.
Further analyses show that our MoT framework benefits both training and
inference stages; that it is particularly effective on harder logical reasoning
problems; and that different modalities contribute complementary strengths,
with truth-table reasoning helping to overcome key bottlenecks in natural
language inference.
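
To make the truth-table modality concrete, here is a minimal sketch of entailment checking by exhaustive case enumeration on a toy propositional problem. The encoding (Python predicates over a dict of atoms) is an assumption for exposition only; the paper's actual truth-table rationale format may differ.

    # Illustrative truth-table reasoning: enumerate every assignment of
    # the atoms and test whether the conclusion holds in all models of
    # the premises. (Toy encoding; not the paper's rationale format.)
    from itertools import product

    def truth_table_entails(atoms, premises, conclusion):
        for values in product([False, True], repeat=len(atoms)):
            world = dict(zip(atoms, values))
            if all(p(world) for p in premises) and not conclusion(world):
                return False  # counterexample: premises hold, conclusion fails
        return True  # conclusion holds in every model of the premises

    # Example: "If it rains, the ground is wet. It rains." |= "The ground is wet."
    atoms = ["rain", "wet"]
    premises = [lambda w: not w["rain"] or w["wet"],  # rain -> wet
                lambda w: w["rain"]]                  # rain
    conclusion = lambda w: w["wet"]
    print(truth_table_entails(atoms, premises, conclusion))  # True

Systematic enumeration of this kind is what lets the truth-table modality cover logical cases that free-form natural-language reasoning tends to skip.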
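The two-phase design can likewise be sketched in a few lines. In the sketch below, sample_rationale, finetune, and answer_with are hypothetical callables standing in for model components, and majority voting is one plausible way to aggregate the three modalities at inference time; the paper's exact filtering and aggregation rules may differ.

    # Hypothetical sketch of MoT's two-phase design (placeholder callables;
    # aggregation rule assumed to be majority vote for illustration).
    from collections import Counter

    MODALITIES = ["natural_language", "code", "truth_table"]

    def self_evolving_round(sample_rationale, finetune, model, dataset):
        """Phase 1: sample one rationale per modality, keep only those whose
        final answer matches the label, then fine-tune on the filtered union."""
        kept = []
        for question, label in dataset:
            for m in MODALITIES:
                rationale, answer = sample_rationale(model, question, m)
                if answer == label:  # answer-based filtering of self-generated rationales
                    kept.append((question, m, rationale))
        return finetune(model, kept)

    def mot_inference(answer_with, model, question):
        """Phase 2: query all three modalities and return the majority answer."""
        answers = [answer_with(model, question, m) for m in MODALITIES]
        return Counter(answers).most_common(1)[0][0]

    # Usage with a stub answerer: two of three modalities agree on "True".
    stub_answer = lambda model, q, m: "True" if m != "code" else "False"
    print(mot_inference(stub_answer, None, "p -> q, p |- q ?"))  # "True"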