The Debugging Decay Index: Rethinking Debugging Strategies for Code LLMs

Abstract

The effectiveness of AI debugging follows a predictable exponential decay pattern: most models lose 60-80% of their debugging capability within just 2-3 attempts, even though iterative debugging is a critical capability for practical code generation systems. We introduce the Debugging Decay Index (DDI), a mathematical framework that quantifies when debugging becomes ineffective and predicts intervention points. Our strategic fresh start approach shifts from exploitation to exploration at strategic points in the debugging process, demonstrating that well-timed interventions can rescue the effectiveness of debugging. DDI reveals a fundamental limitation in current AI debugging and provides the first quantitative framework for optimising iterative code generation strategies.
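
To make the headline figures concrete, the following is a minimal sketch of what "60-80% lost within 2-3 attempts" would imply if effectiveness decayed as a plain exponential, E(t) = E0 * exp(-lam * t). That functional form, the symbol `lam`, and the helper names `decay_rate` and `attempts_until_loss` are assumptions made for illustration; the abstract does not state the exact model behind DDI.

```python
import math


def decay_rate(loss_fraction: float, attempts: float) -> float:
    """Return lam such that 1 - exp(-lam * attempts) == loss_fraction.

    Assumes a plain exponential decay E(t) = E0 * exp(-lam * t); this
    form is an illustrative assumption, not taken from the abstract.
    """
    if not 0.0 < loss_fraction < 1.0:
        raise ValueError("loss_fraction must be strictly between 0 and 1")
    if attempts <= 0:
        raise ValueError("attempts must be positive")
    return -math.log(1.0 - loss_fraction) / attempts


def attempts_until_loss(lam: float, loss_fraction: float) -> float:
    """Return the attempt count at which effectiveness has dropped by loss_fraction."""
    if lam <= 0:
        raise ValueError("lam must be positive")
    if not 0.0 < loss_fraction < 1.0:
        raise ValueError("loss_fraction must be strictly between 0 and 1")
    return -math.log(1.0 - loss_fraction) / lam


# The two extremes implied by "60-80% within 2-3 attempts".
slow = decay_rate(0.60, 3)  # mildest case: 60% lost after 3 attempts
fast = decay_rate(0.80, 2)  # steepest case: 80% lost after 2 attempts
print(f"lambda range: {slow:.3f} to {fast:.3f} per attempt")
```

Under this assumed model the decay constant falls between roughly 0.31 and 0.80 per attempt. Inverting the relation with `attempts_until_loss` gives one hypothetical way to pick an intervention point: choose a tolerated loss and restart once the attempt count exceeds the returned value.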
