Learning to Decode Collaboratively with Multiple Language Models

Abstract

We propose a method to teach multiple large language models (LLMs) to collaborate by interleaving their generations at the token level. We model the decision of which LLM generates the next token as a latent variable. By optimizing the marginal likelihood of a training set under our latent variable model, the base LLM automatically learns when to generate itself and when to call on one of the "assistant" language models to generate, all without direct supervision. Token-level collaboration during decoding allows for a fusion of each model's expertise in a manner tailored to the specific task at hand. Our collaborative decoding is especially useful in cross-domain settings where a generalist base LLM learns to invoke domain expert models. On instruction-following, domain-specific QA, and reasoning tasks, we show that the performance of the joint system exceeds that of the individual models. Through qualitative analysis of the learned latent decisions, we show that models trained with our method exhibit several interesting collaboration patterns, e.g., template-filling. Our code is available at https://github.com/clinicalml/co-llm.
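To make the latent-variable formulation concrete, the following is a minimal sketch, not the released implementation. It assumes that the latent decision is made independently at each token position, so that the marginal probability of a token is the gate-weighted mixture of each model's probability for that token. The function names (`marginal_token_logprob`, `sequence_marginal_loglik`, `choose_generator`) and the toy numbers are hypothetical and chosen only for illustration.

```python
import math


def marginal_token_logprob(gate_probs, model_token_probs):
    """Log marginal probability of one observed token.

    gate_probs[i]        : assumed P(Z_t = i | prefix), the probability that
                           model i generates the next token
    model_token_probs[i] : P_i(x_t | prefix), the probability model i assigns
                           to the observed token
    """
    assert len(gate_probs) == len(model_token_probs)
    # Sum out the latent choice of generator.
    total = sum(g * p for g, p in zip(gate_probs, model_token_probs))
    return math.log(total)


def sequence_marginal_loglik(gates, token_probs):
    """Sum of per-token marginal log-probabilities over one sequence."""
    return sum(marginal_token_logprob(g, p) for g, p in zip(gates, token_probs))


def choose_generator(gate_probs):
    """Greedy decode-time choice (an assumption): pick the most probable generator."""
    return max(range(len(gate_probs)), key=lambda i: gate_probs[i])


# Toy two-model, two-token example: index 0 is the base model,
# index 1 is a single assistant model.
gates = [[0.9, 0.1], [0.2, 0.8]]
token_probs = [[0.5, 0.1], [0.05, 0.6]]

loglik = sequence_marginal_loglik(gates, token_probs)
generators = [choose_generator(g) for g in gates]
```

In this toy case the base model is chosen for the first token and the assistant for the second. In a training setup along these lines, the gate probabilities would come from learned parameters and the negative of `loglik` would serve as the loss, so no labels for the latent decisions are needed.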

