
Tuning Language Models by Proxy

January 16, 2024
Authors: Alisa Liu, Xiaochuang Han, Yizhong Wang, Yulia Tsvetkov, Yejin Choi, Noah A. Smith
cs.AI

Abstract

Despite the general capabilities of large pretrained language models, they consistently benefit from further adaptation to better achieve desired behaviors. However, tuning these models has become increasingly resource-intensive, or impossible when model weights are private. We introduce proxy-tuning, a lightweight decoding-time algorithm that operates on top of black-box LMs to achieve the result of directly tuning the model, but by accessing only its prediction over the output vocabulary. Our method instead tunes a smaller LM, then applies the difference between the predictions of the small tuned and untuned LMs to shift the original predictions of the base model in the direction of tuning, while retaining the benefits of larger scale pretraining. In experiments, when we apply proxy-tuning to Llama2-70B using proxies of only 7B size, we can close 88% of the gap between Llama2-70B and its truly-tuned chat version, when evaluated across knowledge, reasoning, and safety benchmarks. Interestingly, when tested on TruthfulQA, proxy-tuned models are actually more truthful than directly tuned models, possibly because decoding-time guidance better retains the model's factual knowledge. We then demonstrate the generality of proxy-tuning by applying it for domain adaptation on code, and task-specific finetuning on question-answering and math problems. Our work demonstrates the promise of using small tuned LMs to efficiently customize large, potentially proprietary LMs through decoding-time guidance.
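The decoding-time operation the abstract describes, shifting the base model's next-token logits by the difference between a small tuned proxy and its untuned counterpart, can be sketched in a few lines. The snippet below is a minimal illustration of that logit arithmetic, not the authors' released code; the function and variable names are hypothetical, and it assumes all three models share one output vocabulary.

```python
import torch
import torch.nn.functional as F

def proxy_tuned_logits(base_logits: torch.Tensor,
                       tuned_proxy_logits: torch.Tensor,
                       untuned_proxy_logits: torch.Tensor) -> torch.Tensor:
    """Shift the large base model's next-token logits in the direction of tuning,
    using the difference between a small tuned proxy and the same proxy untuned."""
    return base_logits + (tuned_proxy_logits - untuned_proxy_logits)

# Toy example over a 5-token vocabulary (values are illustrative).
base = torch.tensor([2.0, 1.0, 0.5, 0.0, -1.0])      # large, possibly black-box LM
tuned = torch.tensor([1.5, 2.5, 0.0, 0.0, -1.0])      # small tuned proxy
untuned = torch.tensor([1.0, 1.0, 0.5, 0.0, -1.0])    # small untuned proxy

# Sample or argmax from the shifted distribution at each decoding step.
probs = F.softmax(proxy_tuned_logits(base, tuned, untuned), dim=-1)
print(probs)
```

In this sketch the base model only needs to expose its predictions over the output vocabulary, consistent with the black-box setting described in the abstract; no access to its weights is required.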