Why Does Self-Distillation (Sometimes) Degrade the Reasoning Capability of LLMs?

Abstract

Self-distillation has emerged as an effective post-training paradigm for LLMs, often improving performance while shortening reasoning traces. In mathematical reasoning, however, we find that it can shorten responses while degrading performance. We trace this degradation to the suppression of epistemic verbalization, that is, the model's expression of uncertainty during reasoning. Through controlled experiments that vary the richness of the conditioning context and the task coverage, we show that conditioning the teacher on rich information suppresses uncertainty expression. This enables rapid in-domain optimization when task coverage is limited, but it harms out-of-distribution (OOD) performance, where unseen problems benefit from expressing uncertainty and adjusting accordingly. Across Qwen3-8B, DeepSeek-Distill-Qwen-7B, and Olmo3-7B-Instruct, we observe performance drops of up to 40%. Our findings show that exposing an appropriate level of uncertainty is crucial for robust reasoning, and they underscore the importance of optimizing reasoning behavior itself rather than merely reinforcing correct answer traces.

