OpenDevin: An Open Platform for AI Software Developers as Generalist Agents

July 23, 2024
Authors: Xingyao Wang, Boxuan Li, Yufan Song, Frank F. Xu, Xiangru Tang, Mingchen Zhuge, Jiayi Pan, Yueqi Song, Bowen Li, Jaskirat Singh, Hoang H. Tran, Fuqiang Li, Ren Ma, Mingzhang Zheng, Bill Qian, Yanjun Shao, Niklas Muennighoff, Yizhe Zhang, Binyuan Hui, Junyang Lin, Robert Brennan, Hao Peng, Heng Ji, Graham Neubig
cs.AI

Abstract

Software is one of the most powerful tools that we humans have at our disposal; it allows a skilled programmer to interact with the world in complex and profound ways. At the same time, thanks to improvements in large language models (LLMs), there has also been rapid development in AI agents that interact with and effect change in their surrounding environments. In this paper, we introduce OpenDevin, a platform for the development of powerful and flexible AI agents that interact with the world in ways similar to those of a human developer: by writing code, interacting with a command line, and browsing the web. We describe how the platform allows for the implementation of new agents, safe interaction with sandboxed environments for code execution, coordination between multiple agents, and incorporation of evaluation benchmarks. Based on our currently incorporated benchmarks, we perform an evaluation of agents over 15 challenging tasks, including software engineering (e.g., SWE-Bench) and web browsing (e.g., WebArena), among others. Released under the permissive MIT license, OpenDevin is a community project spanning academia and industry with more than 1.3K contributions from over 160 contributors and will improve going forward.
