Lemur: Harmonizing Natural Language and Code for Language Agents

October 10, 2023
Authors: Yiheng Xu, Hongjin Su, Chen Xing, Boyu Mi, Qian Liu, Weijia Shi, Binyuan Hui, Fan Zhou, Yitao Liu, Tianbao Xie, Zhoujun Cheng, Siheng Zhao, Lingpeng Kong, Bailin Wang, Caiming Xiong, Tao Yu
cs.AI

Abstract

We introduce Lemur and Lemur-Chat, openly accessible language models optimized for both natural language and coding capabilities to serve as the backbone of versatile language agents. The evolution from language chat models to functional language agents demands that models not only master human interaction, reasoning, and planning but also ensure grounding in the relevant environments. This calls for a harmonious blend of language and coding capabilities in the models. Lemur and Lemur-Chat are proposed to address this necessity, demonstrating balanced proficiencies in both domains, unlike existing open-source models that tend to specialize in either. Through meticulous pre-training using a code-intensive corpus and instruction fine-tuning on text and code data, our models achieve state-of-the-art averaged performance across diverse text and coding benchmarks among open-source models. Comprehensive experiments demonstrate Lemur's superiority over existing open-source models and its proficiency across various agent tasks involving human communication, tool usage, and interaction under fully and partially observable environments. The harmonization between natural and programming languages enables Lemur-Chat to significantly narrow the gap with proprietary models on agent abilities, providing key insights into developing advanced open-source agents adept at reasoning, planning, and operating seamlessly across environments. https://github.com/OpenLemur/Lemur