FullStack-Agent: Enhancing Agentic Full-Stack Web Coding via Development-Oriented Testing and Repository Back-Translation

Abstract

Helping non-expert users develop complex interactive websites has become a popular task for LLM-powered code agents. However, existing code agents tend to generate only frontend web pages, masking the lack of real full-stack data processing and storage with fancy visual effects. Constructing production-level full-stack web applications is far more challenging than generating frontend pages alone: it demands careful control of data flow, a comprehensive understanding of constantly changing packages and dependencies, and accurate localization of obscure bugs in the codebase. To address these difficulties, we introduce FullStack-Agent, a unified agent system for full-stack agentic coding that consists of three parts: (1) FullStack-Dev, a multi-agent framework with strong planning, code editing, codebase navigation, and bug localization abilities; (2) FullStack-Learn, a data-scaling and self-improving method that back-translates crawled and synthesized website repositories to improve the backbone LLM of FullStack-Dev; and (3) FullStack-Bench, a comprehensive benchmark that systematically tests the frontend, backend, and database functionalities of the generated website. FullStack-Dev outperforms the previous state-of-the-art method by 8.7%, 38.2%, and 15.9% on the frontend, backend, and database test cases, respectively. Additionally, FullStack-Learn raises the performance of a 30B model by 9.7%, 9.5%, and 2.8% on the same three sets of test cases through self-improvement, demonstrating the effectiveness of our approach. The code is released at https://github.com/mnluzimu/FullStack-Agent.
