

Learning to Decode Collaboratively with Multiple Language Models

March 6, 2024
Authors: Shannon Zejiang Shen, Hunter Lang, Bailin Wang, Yoon Kim, David Sontag
cs.AI

Abstract

We propose a method to teach multiple large language models (LLMs) to collaborate by interleaving their generations at the token level. We model the decision of which LLM generates the next token as a latent variable. By optimizing the marginal likelihood of a training set under our latent variable model, the base LLM automatically learns when to generate itself and when to call on one of the "assistant" language models to generate, all without direct supervision. Token-level collaboration during decoding allows for a fusion of each model's expertise in a manner tailored to the specific task at hand. Our collaborative decoding is especially useful in cross-domain settings where a generalist base LLM learns to invoke domain expert models. On instruction-following, domain-specific QA, and reasoning tasks, we show that the performance of the joint system exceeds that of the individual models. Through qualitative analysis of the learned latent decisions, we show models trained with our method exhibit several interesting collaboration patterns, e.g., template-filling. Our code is available at https://github.com/clinicalml/co-llm.
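To make the token-level mechanism concrete, below is a minimal, self-contained sketch of the decoding loop the abstract describes: at each step a learned "deferral" probability decides whether the base model or an assistant model emits the next token. This is not the released co-llm implementation; the toy functions base_model_logits, assistant_model_logits, and deferral_probability, the DEFER_THRESHOLD value, and the greedy token choice are all illustrative stand-ins for real LLM calls and the learned latent-decision head.

```python
# Minimal sketch of token-level collaborative decoding (illustrative only).
# Real models are replaced by deterministic toy distributions so the control
# flow is runnable without any trained weights.
import torch

VOCAB_SIZE = 32          # toy vocabulary size
EOS_ID = 0               # toy end-of-sequence id
DEFER_THRESHOLD = 0.5    # defer to the assistant when p_defer exceeds this

def base_model_logits(tokens: list[int]) -> torch.Tensor:
    """Stand-in for the base LLM's next-token logits."""
    torch.manual_seed(len(tokens))           # deterministic toy behaviour
    return torch.randn(VOCAB_SIZE)

def assistant_model_logits(tokens: list[int]) -> torch.Tensor:
    """Stand-in for an assistant (e.g. domain-expert) LLM's next-token logits."""
    torch.manual_seed(len(tokens) + 10_000)
    return torch.randn(VOCAB_SIZE)

def deferral_probability(tokens: list[int]) -> float:
    """Stand-in for the learned head modelling P(Z_t = 1 | x_<t).
    In the paper this head is trained by maximizing the marginal likelihood
    (1 - p_t) * P_base(x_t | x_<t) + p_t * P_assistant(x_t | x_<t)
    over the training set, without direct supervision of Z_t."""
    torch.manual_seed(len(tokens) + 20_000)
    return torch.rand(()).item()

def collaborative_decode(prompt: list[int], max_new_tokens: int = 20) -> list[int]:
    tokens = list(prompt)
    for _ in range(max_new_tokens):
        p_defer = deferral_probability(tokens)
        # Route this position: assistant model if the deferral head says so, else base.
        if p_defer > DEFER_THRESHOLD:
            logits = assistant_model_logits(tokens)
        else:
            logits = base_model_logits(tokens)
        next_token = int(torch.argmax(logits))   # greedy decoding for simplicity
        tokens.append(next_token)
        if next_token == EOS_ID:
            break
    return tokens

if __name__ == "__main__":
    print(collaborative_decode([5, 7, 11]))
```

The sketch only illustrates per-token routing between two fixed models; the actual system also covers multiple assistants and samples or thresholds the latent decision as appropriate for the task.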