
Learning to Reason via Mixture-of-Thought for Logical Reasoning

May 21, 2025
作者: Tong Zheng, Lichang Chen, Simeng Han, R. Thomas McCoy, Heng Huang
cs.AI

Abstract

Human beings naturally draw on multiple reasoning modalities — that is, different representational formats such as natural language, code, and symbolic logic — to learn and solve logical problems. In contrast, most existing LLM-based approaches operate with a single reasoning modality during training, typically natural language. Although some methods have explored modality selection or augmentation at inference time, the training process remains modality-blind, limiting synergy among modalities. To fill this gap, we propose Mixture-of-Thought (MoT), a framework that enables LLMs to reason across three complementary modalities: natural language, code, and a newly introduced symbolic modality, the truth table, which systematically enumerates logical cases and partially mitigates key failure modes of natural language reasoning. MoT adopts a two-phase design: (1) self-evolving MoT training, which jointly learns from filtered, self-generated rationales across modalities; and (2) MoT inference, which fully leverages the synergy of the three modalities to produce better predictions. Experiments on logical reasoning benchmarks including FOLIO and ProofWriter demonstrate that our MoT framework consistently and significantly outperforms strong LLM baselines that use single-modality chain-of-thought, achieving up to a +11.7pp average accuracy gain. Further analyses show that our MoT framework benefits both the training and inference stages; that it is particularly effective on harder logical reasoning problems; and that different modalities contribute complementary strengths, with truth-table reasoning helping to overcome key bottlenecks in natural language inference.
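To make the two core ideas concrete, here is a minimal sketch of (1) the truth-table modality, which decides entailment by exhaustively enumerating truth assignments, and (2) MoT inference, which combines the per-modality answers by majority vote. This is an illustrative reading of the abstract, not the authors' implementation; all function and variable names (`truth_table_entails`, `mot_vote`) are hypothetical.

```python
# Hypothetical sketch of the two ideas described in the abstract.
# Names are illustrative assumptions, not the authors' code.
from itertools import product
from collections import Counter


def truth_table_entails(premises, conclusion, variables):
    """Truth-table modality: `conclusion` is entailed iff it holds in
    every truth assignment that satisfies all `premises`.

    `premises` and `conclusion` are callables mapping a dict of
    variable name -> bool to a bool; `variables` lists the atom names.
    """
    for values in product([False, True], repeat=len(variables)):
        model = dict(zip(variables, values))
        if all(p(model) for p in premises) and not conclusion(model):
            return False  # found a countermodel
    return True


def mot_vote(predictions):
    """MoT inference (simplified): majority vote over the answers produced
    by the natural-language, code, and truth-table modalities."""
    return Counter(predictions).most_common(1)[0][0]


# Example: {p, p -> q} entails q (modus ponens).
premises = [lambda m: m["p"], lambda m: (not m["p"]) or m["q"]]
conclusion = lambda m: m["q"]
print(truth_table_entails(premises, conclusion, ["p", "q"]))  # True

# Combine hypothetical per-modality answers.
print(mot_vote(["True", "True", "Uncertain"]))  # "True"
```

The paper's actual inference procedure may weight or select modalities differently; the vote above only illustrates how complementary modalities can correct one another's errors.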
