
Reasoning as Compression: Unifying Budget Forcing via the Conditional Information Bottleneck

March 9, 2026
Authors: Fabio Valerio Massoli, Andrey Kuzmin, Arash Behboodi
cs.AI

Abstract

Chain-of-Thought (CoT) prompting improves LLM accuracy on complex tasks but often increases token usage and inference cost. Existing "Budget Forcing" methods, which reduce cost via fine-tuning with heuristic length penalties, suppress both essential reasoning and redundant filler. We recast efficient reasoning as a lossy compression problem under the Information Bottleneck (IB) principle, and identify a key theoretical gap when applying naive IB to transformers: attention violates the Markov property between prompt, reasoning trace, and response. To resolve this issue, we model CoT generation under the Conditional Information Bottleneck (CIB) principle, where the reasoning trace Z acts as a computational bridge that contains only the information about the response Y that is not directly accessible from the prompt X. This yields a general Reinforcement Learning objective: maximize task reward while compressing completions under a prior over reasoning traces, subsuming common heuristics (e.g., length penalties) as special cases (e.g., uniform priors). In contrast to naive token-counting-based approaches, we introduce a semantic prior that measures token cost by surprisal under a language model prior. Empirically, our CIB objective prunes cognitive bloat while preserving fluency and logic, improving accuracy at moderate compression and enabling aggressive compression with minimal accuracy drop.
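The objective described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: it assumes per-token log-probabilities of the reasoning trace under a prior language model have already been computed, and the hyperparameter `beta` and both function names are hypothetical. It shows how the semantic (surprisal-based) compression cost reduces to a plain length penalty when the prior is uniform over the vocabulary.

```python
import math

def cib_reward(task_reward, trace_token_logprobs, beta=0.01):
    """Sketch of a CIB-style RL reward: task reward minus a compression
    cost, where the cost of the reasoning trace is its total surprisal
    (negative log-probability, in nats) under a prior language model.

    trace_token_logprobs: per-token log-probabilities of the trace
    under the prior LM (assumed precomputed elsewhere).
    beta: compression strength (hypothetical hyperparameter).
    """
    surprisal = -sum(trace_token_logprobs)  # total -log p(token)
    return task_reward - beta * surprisal

def length_penalty_reward(task_reward, trace_length, beta=0.01,
                          vocab_size=50_000):
    """Special case: under a uniform prior, every token has surprisal
    log(vocab_size), so the semantic cost collapses to an ordinary
    length penalty -- the common budget-forcing heuristic."""
    return task_reward - beta * trace_length * math.log(vocab_size)
```

Under this formulation, a fluent, predictable trace (high log-probabilities under the prior) is cheaper than an improbable one of the same length, which is what lets the objective prune filler without penalizing concise, information-dense reasoning.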