Tuning Language Models by Proxy
January 16, 2024
Authors: Alisa Liu, Xiaochuang Han, Yizhong Wang, Yulia Tsvetkov, Yejin Choi, Noah A. Smith
cs.AI
Abstract
Despite the general capabilities of large pretrained language models, they
consistently benefit from further adaptation to better achieve desired
behaviors. However, tuning these models has become increasingly
resource-intensive, or impossible when model weights are private. We introduce
proxy-tuning, a lightweight decoding-time algorithm that operates on top of
black-box LMs to achieve the result of directly tuning the model, but by
accessing only its predictions over the output vocabulary. Our method instead
tunes a smaller LM, then applies the difference between the predictions of the
small tuned and untuned LMs to shift the original predictions of the base model
in the direction of tuning, while retaining the benefits of larger scale
pretraining. In experiments, when we apply proxy-tuning to Llama2-70B using
proxies of only 7B size, we can close 88% of the gap between Llama2-70B and its
truly-tuned chat version, when evaluated across knowledge, reasoning, and
safety benchmarks. Interestingly, when tested on TruthfulQA, proxy-tuned models
are actually more truthful than directly tuned models, possibly because
decoding-time guidance better retains the model's factual knowledge. We then
demonstrate the generality of proxy-tuning by applying it for domain adaptation
on code, and task-specific finetuning on question-answering and math problems.
Our work demonstrates the promise of using small tuned LMs to efficiently
customize large, potentially proprietary LMs through decoding-time guidance.
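
The decoding-time shift described above can be sketched briefly. The following is a minimal, hypothetical Python example, not the authors' released implementation: it assumes three causal language models that share one tokenizer and vocabulary, loaded through Hugging Face transformers under placeholder names, and at each step of greedy decoding it adds the logit difference between the small tuned proxy and its untuned counterpart to the large base model's logits.

# Minimal sketch of proxy-tuning at decoding time (illustrative only).
# Model names are placeholders; any three causal LMs sharing a vocabulary work,
# e.g. a large untuned base model plus a small tuned/untuned proxy pair.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

base_name   = "large-base-model"     # placeholder for the large untuned LM
expert_name = "small-tuned-model"    # placeholder for the small tuned proxy
anti_name   = "small-untuned-model"  # placeholder for the small untuned proxy

tok    = AutoTokenizer.from_pretrained(expert_name)
base   = AutoModelForCausalLM.from_pretrained(base_name).eval()
expert = AutoModelForCausalLM.from_pretrained(expert_name).eval()
anti   = AutoModelForCausalLM.from_pretrained(anti_name).eval()

@torch.no_grad()
def proxy_tuned_generate(prompt: str, max_new_tokens: int = 64) -> str:
    ids = tok(prompt, return_tensors="pt").input_ids
    for _ in range(max_new_tokens):
        # Next-token logits from each model on the same prefix.
        logit_base   = base(ids).logits[:, -1, :]
        logit_expert = expert(ids).logits[:, -1, :]
        logit_anti   = anti(ids).logits[:, -1, :]
        # Shift the base model's logits by the tuned-minus-untuned difference,
        # then renormalize; greedy decoding keeps the example short.
        shifted = torch.log_softmax(logit_base + (logit_expert - logit_anti), dim=-1)
        next_id = shifted.argmax(dim=-1, keepdim=True)
        ids = torch.cat([ids, next_id], dim=-1)
        if next_id.item() == tok.eos_token_id:
            break
    return tok.decode(ids[0], skip_special_tokens=True)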