Why Does Self-Distillation (Sometimes) Degrade the Reasoning Capability of LLMs?
March 25, 2026
作者: Jeonghye Kim, Xufang Luo, Minbeom Kim, Sangmook Lee, Dohyung Kim, Jiwon Jeon, Dongsheng Li, Yuqing Yang
cs.AI
Abstract
Self-distillation has emerged as an effective post-training paradigm for LLMs, often improving performance while shortening reasoning traces. In mathematical reasoning, however, we find that it can shorten responses while degrading performance. We trace this degradation to the suppression of epistemic verbalization: the model's expression of uncertainty during reasoning. Through controlled experiments varying conditioning-context richness and task coverage, we show that conditioning the teacher on rich information suppresses uncertainty expression, enabling rapid in-domain optimization with limited task coverage but harming out-of-distribution (OOD) performance, where unseen problems benefit from the model expressing uncertainty and adjusting accordingly. Across Qwen3-8B, DeepSeek-Distill-Qwen-7B, and Olmo3-7B-Instruct, we observe performance drops of up to 40%. Our findings highlight that exposing appropriate levels of uncertainty is crucial for robust reasoning and underscore the importance of optimizing reasoning behavior beyond merely reinforcing correct answer traces.