Don't Throw Away Your Pretrained Model

Abstract

Alignment training has tradeoffs: it helps language models (LMs) improve at reasoning and instruction following, but it can cost them skills such as creativity and calibration, where unaligned base models do better. We aim to get the best of both worlds through model collaboration, in which different models from the same training pipeline collaborate and complement each other. Because LM responses interleave skills that favor different models, we propose Switch Generation, where pretrained and aligned versions of a model take turns to "speak" within a single response sequence. Specifically, we train a switcher LM by learning from the outcomes of choosing different models to generate the next segment across diverse queries and contexts. At inference time, the switcher LM directs different model checkpoints to generate the next segment dynamically, wherever their strengths are most needed. Extensive experiments with 8 model collaboration baselines and 18 datasets show that 1) model collaboration consistently outperforms individual models on 16 out of 18 tasks, and 2) Switch Generation further outperforms the baselines by 12.9% on average. Further analysis reveals that Switch Generation discovers compositional skills that solve problems where individual models struggle, and that it generalizes to unseen models and tasks. It thereby reuses and repurposes by-products of expensive model training pipelines that would otherwise be discarded.
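The inference-time loop described above can be sketched roughly as follows. This is a minimal illustration rather than the authors' implementation: the function `switch_generate`, the `<eos>` stop marker, the segment cap, and the toy stand-in models and rule-based switcher are all assumptions made for the example. In the paper the switcher is a trained LM and the generators are real pretrained and aligned checkpoints.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical interfaces: a generator maps the text so far to the next segment,
# and a switcher maps (query, partial response) to the name of the model to use.
Generator = Callable[[str], str]
Switcher = Callable[[str, str], str]


def switch_generate(
    query: str,
    models: Dict[str, Generator],
    switcher: Switcher,
    max_segments: int = 8,
    stop_token: str = "<eos>",  # assumed end-of-response marker
) -> Tuple[str, List[str]]:
    """Build a response segment by segment, letting the switcher pick the model each time."""
    response = ""
    trace: List[str] = []  # which model produced each segment
    for _ in range(max_segments):
        name = switcher(query, response)
        segment = models[name](query + response)
        trace.append(name)
        if stop_token in segment:
            # Keep the text before the stop marker and finish.
            response += segment.split(stop_token)[0]
            break
        response += segment
    return response, trace


# Toy stand-ins so the sketch runs without any model weights.
def toy_pretrained(context: str) -> str:
    return "Once upon a tide, "


def toy_aligned(context: str) -> str:
    return "the answer is 4.<eos>"


def toy_switcher(query: str, partial: str) -> str:
    # Placeholder rule: open with the pretrained model, then hand over to the aligned one.
    return "pretrained" if partial == "" else "aligned"


response, trace = switch_generate(
    "What is 2 + 2? ",
    {"pretrained": toy_pretrained, "aligned": toy_aligned},
    toy_switcher,
)
print(trace)
print(response)
```

Keeping the switcher as a plain callable means the rule above could be swapped for a learned classifier or LM without changing the loop. How segments are delimited and how the switcher is trained are left open here, since the abstract does not specify them.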
