QiMeng-PRepair: Precise Code Repair via Edit-Aware Reward Optimization

Abstract

Large Language Models (LLMs) achieve strong program repair performance but often suffer from over-editing, where excessive modifications overwrite correct code and hinder bug localization. We systematically quantify the impact of over-editing and introduce the precise repair task, which maximizes reuse of correct code while fixing only the buggy parts. Building on this insight, we propose PRepair, a framework that mitigates over-editing and improves repair accuracy. PRepair has two components: Self-Breaking, which generates diverse buggy programs via controlled bug injection and min-max sampling, and Self-Repairing, which trains models with Edit-Aware Group Relative Policy Optimization (EA-GRPO) using an edit-aware reward that encourages minimal yet correct edits. Experiments show that PRepair improves repair precision by up to 31.4% under fix_1@1, a metric that jointly considers repair correctness and extent, and significantly increases decoding throughput when combined with speculative editing, demonstrating its potential for precise and practical code repair.
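The abstract does not give the form of the edit-aware reward, so the following is only an illustrative sketch of the general idea: reward a repair only if it is correct, and reduce that reward as the edit grows. The function names (`edit_ratio`, `edit_aware_reward`), the line-level similarity measure, and the `penalty_weight` parameter are all assumptions made for illustration and are not taken from the paper.

```python
import difflib


def edit_ratio(buggy: str, repaired: str) -> float:
    """Rough share of the program that changed, in [0, 1].

    Assumption: edit extent is measured at line level with difflib;
    the paper may use a different measure.
    """
    matcher = difflib.SequenceMatcher(
        a=buggy.splitlines(), b=repaired.splitlines(), autojunk=False
    )
    # ratio() is 1.0 for identical inputs and 0.0 when no lines match.
    return 1.0 - matcher.ratio()


def edit_aware_reward(
    buggy: str, repaired: str, passes_tests: bool, penalty_weight: float = 0.5
) -> float:
    """Hypothetical reward: zero for incorrect repairs, otherwise
    1 minus a penalty proportional to how much of the code was edited."""
    if not passes_tests:
        return 0.0
    return 1.0 - penalty_weight * edit_ratio(buggy, repaired)


buggy = "def add(a, b):\n    result = a - b\n    return result\n"
minimal_fix = "def add(a, b):\n    result = a + b\n    return result\n"
full_rewrite = "def add(x, y):\n    total = x + y\n    return total\n"

# Both repairs are assumed to pass the tests; the minimal one scores higher.
print(edit_aware_reward(buggy, minimal_fix, passes_tests=True))
print(edit_aware_reward(buggy, full_rewrite, passes_tests=True))
```

Under these assumptions, a correct one-line fix scores higher than a correct full rewrite, and any incorrect repair scores zero, which is the qualitative behavior the abstract attributes to an edit-aware reward.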
