

Latent Chain-of-Thought as Planning: Decoupling Reasoning from Verbalization

January 29, 2026
Authors: Jiecong Wang, Hao Peng, Chunyang Liu
cs.AI

Abstract

Chain-of-Thought (CoT) empowers Large Language Models (LLMs) to tackle complex problems, but reasoning grounded in discrete token spaces remains constrained by high computational cost and reasoning-path collapse. Recent latent reasoning approaches attempt to improve efficiency by performing reasoning within continuous hidden states. However, these methods typically operate as opaque end-to-end mappings from explicit reasoning steps to latent states, and often require a pre-defined number of latent steps at inference time. In this work, we introduce PLaT (Planning with Latent Thoughts), a framework that reformulates latent reasoning as planning by fundamentally decoupling reasoning from verbalization. We model reasoning as a deterministic trajectory of latent planning states, while a separate Decoder grounds these thoughts into text when necessary. This decoupling allows the model to dynamically determine when to terminate reasoning rather than relying on fixed hyperparameters. Empirical results on mathematical benchmarks reveal a distinct trade-off: while PLaT achieves lower greedy accuracy than the baselines, it demonstrates superior scalability in terms of reasoning diversity. This indicates that PLaT learns a broader, more robust solution space, offering a transparent and scalable foundation for inference-time search.
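
The decoupled design the abstract describes (a deterministic latent-state planner, a separate Decoder for verbalization, and model-decided termination) can be sketched in a few lines. The sketch below is a minimal illustration, not the paper's implementation: the GRU transition, the sigmoid halt rule, and all names (`LatentPlanner`, `Verbalizer`, `plan_then_verbalize`, `halt_head`) are assumptions about one way such a loop could look in PyTorch.

```python
import torch
import torch.nn as nn


class LatentPlanner(nn.Module):
    """Rolls a latent planning state forward deterministically.

    Hypothetical module: the paper models reasoning as a deterministic
    trajectory of latent states; the GRUCell transition is our choice here.
    """

    def __init__(self, dim: int):
        super().__init__()
        self.transition = nn.GRUCell(dim, dim)  # deterministic state update
        self.halt_head = nn.Linear(dim, 1)      # scores "stop reasoning now"

    def step(self, state: torch.Tensor) -> tuple[torch.Tensor, torch.Tensor]:
        next_state = self.transition(state, state)
        halt_logit = self.halt_head(next_state).squeeze(-1)
        return next_state, halt_logit


class Verbalizer(nn.Module):
    """Separate decoder that grounds a latent thought into token logits."""

    def __init__(self, dim: int, vocab_size: int):
        super().__init__()
        self.proj = nn.Linear(dim, vocab_size)

    def forward(self, state: torch.Tensor) -> torch.Tensor:
        return self.proj(state)


def plan_then_verbalize(planner, verbalizer, state,
                        max_steps=16, halt_threshold=0.0):
    """Iterate latent steps until the halt head fires, then decode once.

    The termination rule (halt logit crossing a threshold) is an assumed
    mechanism; the paper only states that the model decides when to stop
    rather than using a fixed step count.
    """
    for _ in range(max_steps):
        state, halt_logit = planner.step(state)
        if halt_logit.item() > halt_threshold:  # model decides to stop
            break
    return verbalizer(state)  # ground the final thought into text space


# Toy usage: a random 256-dim initial state stands in for an encoded
# problem representation; vocabulary size is arbitrary.
planner = LatentPlanner(dim=256)
verbalizer = Verbalizer(dim=256, vocab_size=32000)
logits = plan_then_verbalize(planner, verbalizer, torch.randn(1, 256))
print(logits.shape)  # torch.Size([1, 32000])
```

The property the sketch preserves is that the number of latent steps is decided at inference by the halt score rather than fixed as a hyperparameter, which is what makes the latent trajectory a transparent substrate for inference-time search.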