OpenDevin: An Open Platform for AI Software Developers as Generalist Agents

July 23, 2024
Authors: Xingyao Wang, Boxuan Li, Yufan Song, Frank F. Xu, Xiangru Tang, Mingchen Zhuge, Jiayi Pan, Yueqi Song, Bowen Li, Jaskirat Singh, Hoang H. Tran, Fuqiang Li, Ren Ma, Mingzhang Zheng, Bill Qian, Yanjun Shao, Niklas Muennighoff, Yizhe Zhang, Binyuan Hui, Junyang Lin, Robert Brennan, Hao Peng, Heng Ji, Graham Neubig
cs.AI

Abstract

Software is one of the most powerful tools that we humans have at our disposal; it allows a skilled programmer to interact with the world in complex and profound ways. At the same time, thanks to improvements in large language models (LLMs), there has also been rapid development in AI agents that interact with and effect change in their surrounding environments. In this paper, we introduce OpenDevin, a platform for the development of powerful and flexible AI agents that interact with the world in similar ways to those of a human developer: by writing code, interacting with a command line, and browsing the web. We describe how the platform allows for the implementation of new agents, safe interaction with sandboxed environments for code execution, coordination between multiple agents, and incorporation of evaluation benchmarks. Based on our currently incorporated benchmarks, we perform an evaluation of agents over 15 challenging tasks, including software engineering (e.g., SWE-Bench) and web browsing (e.g., WebArena), among others. Released under the permissive MIT license, OpenDevin is a community project spanning academia and industry with more than 1.3K contributions from over 160 contributors and will improve going forward.
