Soft Thinking: Unlocking the Reasoning Potential of LLMs in Continuous Concept Space
May 21, 2025
作者: Zhen Zhang, Xuehai He, Weixiang Yan, Ao Shen, Chenyang Zhao, Shuohang Wang, Yelong Shen, Xin Eric Wang
cs.AI
Abstract
Human cognition typically involves thinking through abstract, fluid concepts
rather than strictly using discrete linguistic tokens. Current reasoning
models, however, are constrained to reasoning within the boundaries of human
language, processing discrete token embeddings that represent fixed points in
the semantic space. This discrete constraint restricts the expressive power and
upper potential of such reasoning models, often causing incomplete exploration
of reasoning paths, as standard Chain-of-Thought (CoT) methods rely on sampling
one token per step. In this work, we introduce Soft Thinking, a training-free
method that emulates human-like "soft" reasoning by generating soft, abstract
concept tokens in a continuous concept space. These concept tokens are created
by the probability-weighted mixture of token embeddings, which form the
continuous concept space, enabling smooth transitions and richer
representations that transcend traditional discrete boundaries. In essence,
each generated concept token encapsulates multiple meanings from related
discrete tokens, implicitly exploring various reasoning paths to converge
effectively toward the correct answer. Empirical evaluations on diverse
mathematical and coding benchmarks consistently demonstrate the effectiveness
and efficiency of Soft Thinking, improving pass@1 accuracy by up to 2.48 points
while simultaneously reducing token usage by up to 22.4% compared to standard
CoT. Qualitative analysis further reveals that Soft Thinking outputs remain
highly interpretable and readable, highlighting the potential of Soft Thinking
to break the inherent bottleneck of discrete language-based reasoning. Code is
available at https://github.com/eric-ai-lab/Soft-Thinking.
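The core operation the abstract describes, forming a concept token as a probability-weighted mixture of token embeddings rather than sampling a single discrete token, can be illustrated with a short sketch. The snippet below is not the authors' implementation; the function name `concept_token`, the `embedding_matrix` argument, and the top-p truncation of the distribution are assumptions made for illustration only.

```python
import torch
import torch.nn.functional as F

def concept_token(logits: torch.Tensor,
                  embedding_matrix: torch.Tensor,
                  top_p: float = 0.95) -> torch.Tensor:
    """Sketch of a 'concept token': a probability-weighted mixture of
    token embeddings, used in place of a single sampled token.

    logits:           (vocab_size,) next-token logits from the LM head
    embedding_matrix: (vocab_size, hidden_dim) input embedding table
    returns:          (hidden_dim,) continuous embedding in concept space
    """
    probs = F.softmax(logits, dim=-1)

    # Assumed stabilization choice (not taken from the abstract): keep only
    # the top-p nucleus so low-probability tokens do not blur the mixture.
    sorted_probs, sorted_idx = torch.sort(probs, descending=True)
    cumulative = torch.cumsum(sorted_probs, dim=-1)
    keep = cumulative - sorted_probs < top_p  # always keeps the top-1 token
    mask = torch.zeros_like(probs)
    mask[sorted_idx[keep]] = 1.0
    probs = probs * mask
    probs = probs / probs.sum()

    # Probability-weighted mixture of embeddings: a point in the continuous
    # concept space spanned by the vocabulary embeddings.
    return probs @ embedding_matrix
```

In a full pipeline, this vector would replace the usual embedding lookup of a sampled token id at each intermediate reasoning step, letting one step carry weight from several candidate tokens at once; how the final answer is ultimately decoded is not specified by the abstract and is left out of this sketch.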