Learning to Reason via Mixture-of-Thought for Logical Reasoning
May 21, 2025
作者: Tong Zheng, Lichang Chen, Simeng Han, R. Thomas McCoy, Heng Huang
cs.AI
Abstract
Human beings naturally utilize multiple reasoning modalities to learn and
solve logical problems, i.e., different representational formats such as
natural language, code, and symbolic logic. In contrast, most existing
LLM-based approaches operate with a single reasoning modality during training,
typically natural language. Although some methods have explored modality
selection or augmentation at inference time, the training process remains
modality-blind, limiting synergy among modalities. To fill this gap, we propose
Mixture-of-Thought (MoT), a framework that enables LLMs to reason across three
complementary modalities: natural language, code, and a newly introduced
symbolic modality, truth-table, which systematically enumerates logical cases
and partially mitigates key failure modes in natural language reasoning. MoT
adopts a two-phase design: (1) self-evolving MoT training, which jointly learns
from filtered, self-generated rationales across modalities; and (2) MoT
inference, which fully exploits the synergy among the three modalities to produce
better predictions. Experiments on logical reasoning benchmarks including FOLIO
and ProofWriter demonstrate that our MoT framework consistently and
significantly outperforms strong LLM baselines with single-modality
chain-of-thought approaches, achieving up to a +11.7pp gain in average accuracy.
Further analyses show that our MoT framework benefits both training and
inference stages; that it is particularly effective on harder logical reasoning
problems; and that different modalities contribute complementary strengths,
with truth-table reasoning helping to overcome key bottlenecks in natural
language inference.
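
To make the truth-table modality concrete: the abstract describes it as systematically enumerating logical cases. Below is a minimal sketch (an illustration, not the paper's implementation) of how exhaustively enumerating truth assignments decides propositional entailment; the premises and variable names are hypothetical.

```python
from itertools import product

def entailed(premises, conclusion, variables):
    """Check whether `conclusion` holds in every truth assignment
    that satisfies all `premises` (classical propositional entailment).
    Each formula is a function mapping an assignment dict to a bool."""
    for values in product([False, True], repeat=len(variables)):
        assignment = dict(zip(variables, values))
        if all(p(assignment) for p in premises):  # row satisfies all premises
            if not conclusion(assignment):        # but falsifies the conclusion
                return False                      # counterexample found
    return True                                   # no counterexample: entailed

# Hypothetical FOLIO-style instance: "If it rains, the ground is wet. It rains."
premises = [
    lambda a: (not a["rain"]) or a["wet"],  # rain -> wet
    lambda a: a["rain"],                    # rain
]
conclusion = lambda a: a["wet"]             # therefore: wet
print(entailed(premises, conclusion, ["rain", "wet"]))  # True
```

Because every row of the table is inspected, this mode cannot overlook a case, which is the failure the abstract attributes to free-form natural-language reasoning.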
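The MoT inference phase combines predictions from all three modalities. The abstract does not spell out the aggregation rule, so the sketch below assumes a simple majority vote over the per-modality answers as one plausible instantiation; the label strings and tie-breaking rule are hypothetical.

```python
from collections import Counter

def mot_vote(predictions):
    """Aggregate per-modality answers (e.g., 'True'/'False'/'Unknown')
    by majority vote; a three-way tie falls back to the first modality."""
    counts = Counter(predictions.values())
    answer, count = counts.most_common(1)[0]
    if count > 1:                             # at least two modalities agree
        return answer
    return next(iter(predictions.values()))   # tie: trust the first modality

# Hypothetical per-modality outputs for one ProofWriter-style question
preds = {"natural_language": "True", "code": "True", "truth_table": "Unknown"}
print(mot_vote(preds))  # "True"
```

In practice one might instead weight each modality by its validation accuracy, but a plain vote already illustrates how complementary modalities can correct one another's errors.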