
Compress-Distill: Reasoning Trace Compression for Efficient Knowledge Distillation

June 4, 2026
Authors: Maxime Griot, Paul Steven Scotti, Tanishq Mathew Abraham
cs.AI

Abstract

Reasoning models produce long chain-of-thought traces that are costly to distill and encourage verbose student outputs. We study post-hoc compression of such traces before knowledge distillation. Two teachers, Qwen3.5-397B-A17B and gpt-oss-120B, generate about 283k correct traces each; two instruction-tuned models then compress them to 8.6-21.0% of their original character length. Across a 48-run main grid plus seven Qwen-teacher truncation ablations, compressed traces reduce training tokens to 12-30% of raw, speed up training by 2.0-7.6x, and shorten inference outputs by 3-19x, with smaller reductions under the shorter gpt-oss teacher. However, raw traces retain the highest downstream accuracy at every scale and for both teachers. A length-matched raw-trace truncation ablation shows that compression does not merely benefit from a smaller token budget: model-compressed traces usually beat or match naive truncation, especially for smaller students, while maintaining shorter inference outputs. Overall, reasoning-trace compression offers an accuracy-efficiency trade-off rather than a free improvement: students retain up to 96% of raw-trace accuracy while gaining up to 18x higher per-token efficiency, and at the 0.8B scale under LoRA, compressed traces narrow the raw-vs-compressed gap but do not exceed raw.
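
To make the length-matched truncation baseline concrete, here is a minimal Python sketch of one way such a control could be built. It is not the authors' implementation. It assumes that matching is done on character length and that truncation keeps the prefix of the raw trace; the abstract does not specify either choice. The function names and the sample strings are hypothetical.

```python
def compression_ratio(raw_trace: str, compressed_trace: str) -> float:
    """Compressed length as a fraction of raw length, measured in characters."""
    if not raw_trace:
        raise ValueError("raw_trace must be non-empty")
    return len(compressed_trace) / len(raw_trace)


def truncate_to_match(raw_trace: str, compressed_trace: str) -> str:
    """Naive baseline: keep a prefix of the raw trace with the same character
    budget as the compressed trace (assumption: prefix truncation)."""
    budget = len(compressed_trace)
    return raw_trace[:budget]


# Hypothetical example traces, for illustration only.
raw = (
    "First, note that 12 * 12 = 144. Let me double-check: 12 * 10 = 120, "
    "12 * 2 = 24, and 120 + 24 = 144. Then subtract 44 to get 100. "
    "So the answer is 100."
)
compressed = "12*12=144; 144-44=100. Answer: 100."

truncated = truncate_to_match(raw, compressed)
ratio = compression_ratio(raw, compressed)
```

In this toy case the truncated prefix has the same length as the compressed trace but stops before the final answer, which gives some intuition for why a model-written compression could beat naive truncation at an equal budget.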