Investigating Autonomous Agent Contributions in the Wild: Activity Patterns and Code Change over Time

Abstract

The rise of large language models for code has reshaped software development. Autonomous coding agents, able to create branches, open pull requests, and perform code reviews, now actively contribute to real-world projects. Their growing role offers a unique and timely opportunity to investigate AI-driven contributions and their effects on code quality, team dynamics, and software maintainability. In this work, we construct a novel dataset of approximately 110,000 open-source pull requests, including associated commits, comments, reviews, issues, and file changes, collectively representing millions of lines of source code. We compare five popular coding agents (OpenAI Codex, Claude Code, GitHub Copilot, Google Jules, and Devin), examining how their usage differs across development aspects such as merge frequency, edited file types, and developer interaction signals, including comments and reviews. Furthermore, we emphasize that code authoring and review are only a small part of the larger software engineering process, as the resulting code must also be maintained and updated over time. Hence, we offer several longitudinal estimates of survival and churn rates for agent-generated versus human-authored code. Ultimately, our findings indicate increasing agent activity in open-source projects, although agent contributions are associated with more churn over time than human-authored code.
