Knowing Before Saying: LLM Representations Encode Information About Chain-of-Thought Success Before Completion

May 30, 2025
Authors: Anum Afzal, Florian Matthes, Gal Chechik, Yftah Ziser
cs.AI

Abstract

We investigate whether the success of a zero-shot Chain-of-Thought (CoT) process can be predicted before completion. We discover that a probing classifier, based on LLM representations, performs well even before a single token is generated, suggesting that crucial information about the reasoning process is already present in the initial steps' representations. In contrast, a strong BERT-based baseline, which relies solely on the generated tokens, performs worse, likely because it depends on shallow linguistic cues rather than deeper reasoning dynamics. Surprisingly, using later reasoning steps does not always improve classification. When additional context is unhelpful, earlier representations resemble later ones more closely, suggesting that LLMs encode key information early. This implies reasoning can often stop early without loss. To test this, we conduct early-stopping experiments, showing that truncating CoT reasoning still improves performance over not using CoT at all, though a gap remains compared to full reasoning. However, approaches such as supervised learning or reinforcement learning designed to shorten CoT chains could leverage our classifier's guidance to identify when early stopping is effective. Our findings provide insights that may support such methods, helping to optimize CoT's efficiency while preserving its benefits.
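
To make the probing setup concrete, here is a minimal sketch (not the authors' released code) of the idea described in the abstract: take the LLM's hidden-state representation of the prompt before any CoT token is generated, and train a lightweight classifier to predict whether the eventual CoT answer will be correct. The model name, layer choice, CoT trigger phrase, and labeling function are illustrative assumptions, not details from the paper.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

model_name = "meta-llama/Llama-2-7b-hf"  # assumption: any causal LM with accessible hidden states
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, output_hidden_states=True)
model.eval()

def prompt_representation(question: str, layer: int = -1) -> torch.Tensor:
    """Hidden state of the last prompt token at a chosen layer,
    i.e. the representation available before generation starts."""
    prompt = f"{question}\nLet's think step by step."  # zero-shot CoT trigger
    inputs = tokenizer(prompt, return_tensors="pt")
    with torch.no_grad():
        out = model(**inputs)
    return out.hidden_states[layer][0, -1]  # shape: (hidden_dim,)

def train_probe(questions, cot_is_correct):
    """`questions` and the 0/1 labels `cot_is_correct` are assumed to come from
    running the full zero-shot CoT pipeline once and scoring the final answers."""
    X = torch.stack([prompt_representation(q) for q in questions]).float().numpy()
    X_tr, X_te, y_tr, y_te = train_test_split(X, cot_is_correct, test_size=0.2, random_state=0)
    probe = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
    print("probe accuracy:", accuracy_score(y_te, probe.predict(X_te)))
    return probe
```

A probe trained this way could, as the abstract suggests, serve as a gating signal for supervised or RL-based methods that decide when a CoT chain can be truncated early.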