MITS: Enhanced Tree Search Reasoning for LLMs via Pointwise Mutual Information
October 4, 2025
Authors: Jiaxi Li, Yucheng Shi, Jin Lu, Ninghao Liu
cs.AI
Abstract
Tree search has become a representative framework for test-time reasoning
with large language models (LLMs), exemplified by methods such as
Tree-of-Thought and Monte Carlo Tree Search that explore multiple reasoning
paths. However, it remains difficult to provide instant and reliable
quantitative assessments of intermediate reasoning step quality, and extensive
path exploration is computationally costly. To address this, we propose Mutual
Information Tree Search (MITS), a novel framework that guides reasoning with
information-theoretic principles. MITS introduces an effective scoring function
based on pointwise mutual information (PMI), which enables step-wise evaluation
of reasoning paths and search tree expansion via beam search without expensive
look-ahead simulations, achieving superior reasoning performance while
maintaining computational efficiency. The framework is complemented by an
entropy-based dynamic sampling strategy that adaptively allocates computational
resources to uncertain reasoning steps where exploration is most beneficial.
For final prediction, MITS employs a weighted voting scheme that combines PMI
scores with prediction consensus. Through comprehensive experiments on diverse
reasoning benchmarks, MITS consistently surpasses baseline methods,
establishing a principled and efficient framework for LLM reasoning.
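The abstract names three ingredients: a PMI-based step score, entropy-driven allocation of sampling budget, and a voting scheme that weights predictions by PMI. A minimal sketch of how such components could fit together is below; the function names, the entropy threshold, and the exponential weighting are illustrative assumptions, not the paper's actual formulas.

```python
import math
from collections import defaultdict

def pmi_score(logp_y_given_x: float, logp_y: float) -> float:
    """Pointwise mutual information: log p(y|x) - log p(y).
    Positive when conditioning on x makes y more likely."""
    return logp_y_given_x - logp_y

def entropy(probs):
    """Shannon entropy (in nats) of a candidate-step distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def sample_budget(probs, base=2, extra=4, threshold=1.0):
    """Dynamic sampling: spend more samples on high-entropy
    (uncertain) steps, where extra exploration helps most."""
    return base + (extra if entropy(probs) > threshold else 0)

def weighted_vote(candidates):
    """candidates: list of (answer, pmi_score) pairs, one per
    reasoning path. Combines PMI scores with prediction consensus:
    each path votes with weight exp(score), so agreeing paths and
    high-PMI paths both raise an answer's total."""
    totals = defaultdict(float)
    for answer, score in candidates:
        totals[answer] += math.exp(score)
    return max(totals, key=totals.get)
```

For example, `weighted_vote([("A", 1.0), ("B", 0.0), ("B", 0.5)])` returns `"A"`: the single high-PMI path for A (weight e ≈ 2.72) outweighs the two lower-scored paths for B (1 + e^0.5 ≈ 2.65), illustrating how score and consensus trade off.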