Black-Box On-Policy Distillation of Large Language Models
November 13, 2025
Authors: Tianzhu Ye, Li Dong, Zewen Chi, Xun Wu, Shaohan Huang, Furu Wei
cs.AI
Abstract
Black-box distillation creates student large language models (LLMs) by learning from a proprietary teacher model's text outputs alone, without access to its internal logits or parameters. In this work, we introduce Generative Adversarial Distillation (GAD), which enables on-policy and black-box distillation. GAD frames the student LLM as a generator and trains a discriminator to distinguish its responses from the teacher LLM's, creating a minimax game. The discriminator acts as an on-policy reward model that co-evolves with the student, providing stable, adaptive feedback. Experimental results show that GAD consistently surpasses the commonly used sequence-level knowledge distillation. In particular, Qwen2.5-14B-Instruct (student) trained with GAD becomes comparable to its teacher, GPT-5-Chat, on the LMSYS-Chat automatic evaluation. The results establish GAD as a promising and effective paradigm for black-box LLM distillation.
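The adversarial setup described above can be sketched, assuming a standard GAN-style minimax objective over prompt-response pairs (the paper's exact loss and any regularization terms may differ), as:

$$ \min_{\theta}\;\max_{\phi}\; \mathbb{E}_{x\sim\mathcal{D},\,y_T\sim\pi_T(\cdot\mid x)}\!\left[\log D_{\phi}(x,y_T)\right] \;+\; \mathbb{E}_{x\sim\mathcal{D},\,y_S\sim\pi_{\theta}(\cdot\mid x)}\!\left[\log\!\left(1-D_{\phi}(x,y_S)\right)\right] $$

Here $\pi_{\theta}$ denotes the student generator, $\pi_T$ the black-box teacher observed only through its text outputs, and $D_{\phi}$ the discriminator; the discriminator's score on the student's own sampled responses serves as the on-policy reward signal that co-evolves with the student.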