Mixture-of-Agents Enhances Large Language Model Capabilities

June 7, 2024
作者: Junlin Wang, Jue Wang, Ben Athiwaratkun, Ce Zhang, James Zou
cs.AI

Abstract

Recent advances in large language models (LLMs) demonstrate substantial capabilities in natural language understanding and generation tasks. With the growing number of LLMs, how to harness the collective expertise of multiple LLMs is an exciting open direction. Toward this goal, we propose a new approach that leverages the collective strengths of multiple LLMs through a Mixture-of-Agents (MoA) methodology. In our approach, we construct a layered MoA architecture wherein each layer comprises multiple LLM agents. Each agent takes all the outputs from agents in the previous layer as auxiliary information when generating its response. MoA models achieve state-of-the-art performance on AlpacaEval 2.0, MT-Bench, and FLASK, surpassing GPT-4 Omni. For example, our MoA using only open-source LLMs leads AlpacaEval 2.0 by a substantial margin, achieving a score of 65.1% compared to 57.5% for GPT-4 Omni.
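
To make the layered design concrete, here is a minimal Python sketch of how such a pipeline might be wired together. The `query_llm` helper, the model names, and the prompt wording are illustrative assumptions for this sketch, not the authors' implementation or exact configuration.

```python
# Minimal sketch of a layered Mixture-of-Agents (MoA) pipeline.
# Assumption: query_llm(model, prompt) is a placeholder for an actual LLM API call.
from typing import List


def query_llm(model: str, prompt: str) -> str:
    """Placeholder for a call to an LLM endpoint (e.g., an open-source model server)."""
    raise NotImplementedError


def moa_respond(user_prompt: str, layers: List[List[str]], aggregator: str) -> str:
    """Pass the prompt through successive layers of agents, then aggregate the final layer."""
    previous_outputs: List[str] = []
    for layer_models in layers:
        current_outputs = []
        for model in layer_models:
            if previous_outputs:
                # Each agent sees the original prompt plus all outputs from the previous layer
                # as auxiliary information.
                context = "\n\n".join(previous_outputs)
                prompt = (
                    f"{user_prompt}\n\n"
                    f"Responses from the previous layer, for reference:\n{context}"
                )
            else:
                prompt = user_prompt
            current_outputs.append(query_llm(model, prompt))
        previous_outputs = current_outputs

    # A final aggregator model synthesizes the last layer's outputs into one response.
    synthesis_prompt = (
        f"{user_prompt}\n\nCandidate responses:\n"
        + "\n\n".join(previous_outputs)
        + "\n\nSynthesize these into a single high-quality answer."
    )
    return query_llm(aggregator, synthesis_prompt)


# Illustrative usage with placeholder model names (not the paper's configuration):
# answer = moa_respond(
#     "Explain mixture-of-agents in one paragraph.",
#     layers=[["model-a", "model-b", "model-c"]] * 3,
#     aggregator="model-a",
# )
```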
