Knowing Before Saying: LLM Representations Encode Information About Chain-of-Thought Success Before Completion

Abstract

We investigate whether the success of a zero-shot Chain-of-Thought (CoT) process can be predicted before completion. We find that a probing classifier based on LLM representations performs well even before a single token is generated, suggesting that crucial information about the reasoning process is already present in the representations of the initial steps. In contrast, a strong BERT-based baseline, which relies solely on the generated tokens, performs worse, likely because it depends on shallow linguistic cues rather than deeper reasoning dynamics. Surprisingly, using later reasoning steps does not always improve classification. When additional context is unhelpful, earlier representations more closely resemble later ones, suggesting that LLMs encode key information early. This implies that reasoning can often stop early without loss. To test this, we conduct early-stopping experiments, which show that truncated CoT reasoning still improves performance over not using CoT at all, though a gap remains compared to full reasoning. Nevertheless, approaches such as supervised learning or reinforcement learning designed to shorten CoT chains could use our classifier's guidance to identify when early stopping is effective. Our findings provide insights that may support such methods, helping to optimize CoT's efficiency while preserving its benefits.
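
As a rough illustration of what a probing classifier is, the sketch below fits a logistic-regression probe to fixed feature vectors and measures held-out accuracy. Everything in it is an assumption made for illustration: the vectors are synthetic random stand-ins for LLM hidden states, the binary label stands in for "the CoT ended in a correct answer", and the helper names (`train_probe`, `probe_accuracy`, `make_synthetic_states`) are hypothetical. The abstract does not specify the probe architecture or training setup, so this should not be read as the authors' method.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))


def train_probe(X, y, lr=0.5, steps=200):
    """Fit a linear (logistic-regression) probe with batch gradient descent."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(steps):
        grad_w, grad_b = [0.0] * d, 0.0
        for x, t in zip(X, y):
            err = sigmoid(dot(w, x) + b) - t  # gradient of log-loss w.r.t. the logit
            grad_b += err
            for j in range(d):
                grad_w[j] += err * x[j]
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def probe_accuracy(w, b, X, y):
    """Fraction of examples whose predicted label matches the true label."""
    hits = sum((1 if dot(w, x) + b > 0 else 0) == t for x, t in zip(X, y))
    return hits / len(X)


def make_synthetic_states(n=400, d=16, seed=0):
    """ASSUMPTION: random vectors stand in for hidden states captured before
    generation; the label is a noisy linear function of them, mimicking a
    setting where success is linearly decodable."""
    rng = random.Random(seed)
    direction = [rng.gauss(0, 1) for _ in range(d)]
    X, y = [], []
    for _ in range(n):
        x = [rng.gauss(0, 1) for _ in range(d)]
        X.append(x)
        y.append(1 if dot(direction, x) + rng.gauss(0, 0.5) > 0 else 0)
    return X, y


X, y = make_synthetic_states()
w, b = train_probe(X[:300], y[:300])       # train on the first 300 examples
acc = probe_accuracy(w, b, X[300:], y[300:])  # evaluate on the remaining 100
print(f"held-out probe accuracy: {acc:.2f}")
```

With real data, `X` would hold representations extracted from the model at a chosen layer and step, and `y` would record whether each completed chain reached the correct answer. A probe that stays accurate on held-out examples suggests the outcome is linearly decodable from those representations.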
