SemCoT: Accelerating Chain-of-Thought Reasoning through Semantically-Aligned Implicit Tokens
October 28, 2025
Authors: Yinhan He, Wendy Zheng, Yaochen Zhu, Zaiyi Zheng, Lin Su, Sriram Vasudevan, Qi Guo, Liangjie Hong, Jundong Li
cs.AI
Abstract
The verbosity of Chain-of-Thought (CoT) reasoning hinders its mass deployment
in efficiency-critical applications. Recently, implicit CoT approaches have
emerged that encode reasoning steps within an LLM's hidden embeddings (termed
"implicit reasoning") rather than as explicit tokens. This approach accelerates
CoT by shortening the reasoning and bypassing some LLM components. However,
existing implicit CoT methods face two significant challenges: (1) they fail to
preserve the semantic alignment between the implicit reasoning (when
transformed into natural language) and the ground-truth reasoning, resulting in
significant CoT performance degradation, and (2) they focus on reducing the
length of the implicit reasoning but neglect the considerable time an LLM takes
to generate each individual implicit reasoning token. To tackle these
challenges, we propose a novel semantically-aligned implicit CoT framework
termed SemCoT. For the first challenge, we design a contrastively trained
sentence transformer that evaluates the semantic alignment between implicit and
explicit reasoning, which is used to enforce semantic preservation during
implicit reasoning optimization. For the second challenge, we introduce an
efficient implicit reasoning generator by finetuning a lightweight language
model with knowledge distillation. Guided by our sentence transformer, this
generator distills ground-truth reasoning into semantically aligned implicit
reasoning while also optimizing for accuracy. SemCoT is the first approach that
enhances CoT efficiency by jointly optimizing token-level generation speed and
preserving semantic alignment with ground-truth reasoning. Extensive
experiments demonstrate the superior performance of SemCoT over
state-of-the-art methods in both efficiency and effectiveness. Our code is
available at https://github.com/YinhanHe123/SemCoT/.
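The contrastive alignment idea from the abstract can be illustrated with a minimal sketch: pair each implicit-reasoning embedding with its ground-truth explicit-reasoning embedding and score them with an InfoNCE-style loss, where the other pairs in the batch act as negatives. This is an illustrative numpy toy, not the authors' exact objective; the function names, embedding shapes, and temperature value are assumptions.

```python
import numpy as np

def cosine_sim(a, b):
    # Pairwise cosine similarity between rows of a and rows of b.
    a = a / np.linalg.norm(a, axis=1, keepdims=True)
    b = b / np.linalg.norm(b, axis=1, keepdims=True)
    return a @ b.T

def info_nce_alignment_loss(implicit_emb, explicit_emb, temperature=0.1):
    """InfoNCE-style contrastive loss (illustrative, not SemCoT's exact loss):
    each implicit-reasoning embedding should be most similar to its paired
    ground-truth explicit-reasoning embedding; other batch pairs are negatives."""
    sim = cosine_sim(implicit_emb, explicit_emb) / temperature  # shape (B, B)
    # Numerically stable row-wise log-softmax; diagonal entries are positives.
    logits = sim - sim.max(axis=1, keepdims=True)
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_prob))

# Toy check: well-aligned embeddings should incur a lower loss than random ones.
rng = np.random.default_rng(0)
B, d = 4, 8
explicit = rng.normal(size=(B, d))
aligned = explicit + 0.01 * rng.normal(size=(B, d))  # nearly identical pairs
random_emb = rng.normal(size=(B, d))                 # unrelated embeddings
assert info_nce_alignment_loss(aligned, explicit) < info_nce_alignment_loss(random_emb, explicit)
```

In the paper's framework, a loss of this family would train the sentence transformer that scores implicit-versus-explicit alignment, which in turn guides the distilled lightweight generator.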