UniSD: Towards a Unified Self-Distillation Framework for Large Language Models

Abstract

Self-distillation (SD) offers a promising path for adapting large language models (LLMs) without relying on stronger external teachers. However, SD in autoregressive LLMs remains challenging because self-generated trajectories are free-form, correctness is task-dependent, and plausible rationales can still provide unstable or unreliable supervision. Existing methods mainly examine isolated design choices, leaving their effectiveness, roles, and interactions unclear. In this paper, we propose UniSD, a unified framework to systematically study self-distillation. UniSD integrates complementary mechanisms that address supervision reliability, representation alignment, and training stability, including multi-teacher agreement, EMA teacher stabilization, token-level contrastive learning, feature matching, and divergence clipping. Across six benchmarks and six models from three model families, UniSD reveals when self-distillation improves over static imitation, which components drive the gains, and how these components interact across tasks. Guided by these insights, we construct UniSDfull, an integrated pipeline that combines complementary components and achieves the strongest overall performance, improving over the base model by +5.4 points and the strongest baseline by +2.8 points. Extensive evaluation highlights self-distillation as a practical and steerable approach for efficient LLM adaptation without stronger external teachers.
