
AgentTuning: Enabling Generalized Agent Abilities for LLMs

October 19, 2023
作者: Aohan Zeng, Mingdao Liu, Rui Lu, Bowen Wang, Xiao Liu, Yuxiao Dong, Jie Tang
cs.AI

Abstract

Open large language models (LLMs) with great performance in various tasks have significantly advanced the development of LLMs. However, they are far inferior to commercial models such as ChatGPT and GPT-4 when acting as agents to tackle complex tasks in the real world. These agent tasks employ LLMs as the central controller responsible for planning, memorization, and tool utilization, necessitating both fine-grained prompting methods and robust LLMs to achieve satisfactory performance. Though many prompting methods have been proposed to complete particular agent tasks, there is a lack of research focusing on improving the agent capabilities of LLMs themselves without compromising their general abilities. In this work, we present AgentTuning, a simple and general method to enhance the agent abilities of LLMs while maintaining their general LLM capabilities. We construct AgentInstruct, a lightweight instruction-tuning dataset containing high-quality interaction trajectories. We employ a hybrid instruction-tuning strategy by combining AgentInstruct with open-source instructions from general domains. AgentTuning is used to instruction-tune the Llama 2 series, resulting in AgentLM. Our evaluations show that AgentTuning enables LLMs' agent capabilities without compromising their general abilities. The AgentLM-70B is comparable to GPT-3.5-turbo on unseen agent tasks, demonstrating generalized agent capabilities. We open-source AgentInstruct and the AgentLM-7B, 13B, and 70B models at https://github.com/THUDM/AgentTuning , serving as open and powerful alternatives to commercial LLMs for agent tasks.
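The hybrid instruction-tuning strategy described above interleaves agent trajectories from AgentInstruct with general-domain instructions rather than training on agent data alone. A minimal sketch of such a data mixture is below; the mixing ratio `eta`, the function name, and the sample format are illustrative assumptions, not details taken from the paper.

```python
import random

def mix_instruction_data(agent_samples, general_samples, eta=0.2, seed=0):
    """Build a hybrid instruction-tuning mixture.

    With probability eta, draw an agent-trajectory sample (AgentInstruct);
    otherwise draw a general-domain instruction sample. The total mixture
    size matches the combined pool size.
    """
    rng = random.Random(seed)
    n = len(agent_samples) + len(general_samples)
    mixture = []
    for _ in range(n):
        if rng.random() < eta:
            mixture.append(rng.choice(agent_samples))
        else:
            mixture.append(rng.choice(general_samples))
    return mixture

# Toy pools standing in for the real datasets.
agent = [{"source": "AgentInstruct", "id": i} for i in range(100)]
general = [{"source": "general", "id": i} for i in range(400)]

mixed = mix_instruction_data(agent, general, eta=0.2)
frac_agent = sum(s["source"] == "AgentInstruct" for s in mixed) / len(mixed)
```

Sampling per-example with a fixed ratio (rather than concatenating the datasets) keeps the share of agent data stable regardless of the relative pool sizes, which is one simple way to balance agent-specific and general capabilities during fine-tuning.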