Learning to Decode Collaboratively with Multiple Language Models

Abstract

We propose a method to teach multiple large language models (LLMs) to collaborate by interleaving their generations at the token level. We model the decision of which LLM generates the next token as a latent variable. By optimizing the marginal likelihood of a training set under our latent variable model, the base LLM automatically learns when to generate a token itself and when to call on one of the "assistant" language models to generate it, all without direct supervision. Token-level collaboration during decoding allows each model's expertise to be fused in a manner tailored to the specific task at hand. Our collaborative decoding is especially useful in cross-domain settings where a generalist base LLM learns to invoke domain expert models. On instruction-following, domain-specific QA, and reasoning tasks, we show that the performance of the joint system exceeds that of the individual models. Through qualitative analysis of the learned latent decisions, we show that models trained with our method exhibit several interesting collaboration patterns, e.g., template-filling. Our code is available at https://github.com/clinicalml/co-llm.
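
To make the latent-variable view concrete, the following is a minimal sketch, not the released co-llm implementation. It assumes a single assistant model, a binary latent decision per token, and a hypothetical gate function that returns the probability of deferring to the assistant. Under those assumptions the marginal probability of a token is a gate-weighted mixture of the two models' next-token probabilities. The toy models, the gate, and the thresholded greedy decoding rule are all illustrative assumptions.

```python
import math
from typing import Callable, Dict, List, Optional, Sequence

Dist = Dict[str, float]                  # token -> probability
Model = Callable[[Sequence[str]], Dist]  # prefix -> next-token distribution
Gate = Callable[[Sequence[str]], float]  # prefix -> P(defer to assistant)


def marginal_token_prob(token: str, prefix: Sequence[str],
                        base: Model, assistant: Model, gate: Gate) -> float:
    """Sum out the binary latent decision of which model emits the token."""
    p_defer = gate(prefix)
    return ((1.0 - p_defer) * base(prefix).get(token, 0.0)
            + p_defer * assistant(prefix).get(token, 0.0))


def marginal_log_likelihood(tokens: Sequence[str],
                            base: Model, assistant: Model, gate: Gate) -> float:
    """Log marginal likelihood of one sequence (the quantity a trainer would maximize).

    Raises ValueError if some token has zero marginal probability.
    """
    return sum(
        math.log(marginal_token_prob(tok, tokens[:t], base, assistant, gate))
        for t, tok in enumerate(tokens)
    )


def collaborative_greedy_decode(base: Model, assistant: Model, gate: Gate,
                                max_tokens: int, threshold: float = 0.5,
                                eos: str = "<eos>") -> List[str]:
    """Assumed decoding rule: defer when the gate exceeds a threshold, then pick greedily."""
    out: List[str] = []
    for _ in range(max_tokens):
        model = assistant if gate(out) > threshold else base
        dist = model(out)
        token = max(dist, key=dist.get)
        if token == eos:
            break
        out.append(token)
    return out


# --- Hypothetical toy components, for illustration only ---
def _last(prefix: Sequence[str]) -> Optional[str]:
    return prefix[-1] if prefix else None


def toy_base(prefix: Sequence[str]) -> Dist:
    # A "generalist" that writes the template but gets the fact wrong.
    table = {
        None: {"The": 0.9, "42": 0.1},
        "The": {"answer": 0.8, "42": 0.2},
        "answer": {"is": 0.9, "<eos>": 0.1},
        "is": {"7": 0.6, "42": 0.4},
    }
    return table.get(_last(prefix), {"<eos>": 1.0})


def toy_assistant(prefix: Sequence[str]) -> Dist:
    # An "expert" that only knows how to fill the slot after "is".
    return {"42": 0.95, "7": 0.05} if _last(prefix) == "is" else {"<eos>": 1.0}


def toy_gate(prefix: Sequence[str]) -> float:
    # Stands in for the learned deferral probability.
    return 0.9 if _last(prefix) == "is" else 0.1


print(collaborative_greedy_decode(toy_base, toy_assistant, toy_gate, max_tokens=10))
# -> ['The', 'answer', 'is', '42']
```

In this toy run the base model produces the sentence template and the assistant fills in the factual slot, which loosely mirrors the template-filling pattern mentioned in the abstract. In a real system the gate would be trained jointly with the base model by maximizing the marginal log-likelihood over a training set.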
