KAPSO: A Knowledge-grounded framework for Autonomous Program Synthesis and Optimization

Abstract

We introduce KAPSO, a modular framework for autonomous program synthesis and optimization. Given a natural language goal and an evaluation method, KAPSO iteratively performs ideation, code synthesis and editing, execution, evaluation, and learning to improve a runnable artifact toward measurable objectives. Rather than treating synthesis as the endpoint, KAPSO uses synthesis as an operator within a long-horizon optimization loop, where progress is defined by evaluator outcomes. KAPSO targets long-horizon failures common in coding agents, including lost experimental state, brittle debugging, and weak reuse of domain expertise, by integrating three tightly coupled components. First, a git-native experimentation engine isolates each attempt as a branch, producing reproducible artifacts and preserving provenance across iterations. Second, a knowledge system ingests heterogeneous sources, including repositories, internal playbooks, and curated external resources such as documentation, scientific papers, and web search results, and organizes them into a structured representation that supports retrieval over workflows, implementations, and environment constraints. Third, a cognitive memory layer coordinates retrieval and maintains an episodic store of reusable lessons distilled from experiment traces (run logs, diffs, and evaluator feedback), reducing repeated error modes and accelerating convergence. We evaluate KAPSO on MLE-Bench (Kaggle-style ML competitions) and ALE-Bench (AtCoder heuristic optimization) and report end-to-end performance. Code is available at: https://github.com/Leeroo-AI/kapso
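
The evaluator-driven loop described in the abstract can be illustrated with a small toy sketch. This is not KAPSO's code or API: the names `Attempt`, `Memory`, `optimize`, `propose`, and `evaluate` are hypothetical, the "artifact" is a single number rather than a program, and the branch names only mimic the idea of isolating each attempt with a recorded parent. The sketch only shows the shape of the loop: propose a candidate, score it with the evaluator, keep it if it improves, and otherwise record a lesson that later proposals can consult.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple


@dataclass
class Attempt:
    branch: str            # hypothetical branch-style label for this attempt
    parent: Optional[str]  # label of the attempt it was derived from (provenance)
    artifact: float        # toy stand-in for a runnable artifact
    score: float           # evaluator outcome; higher is better


@dataclass
class Memory:
    # Toy episodic store: one short lesson per attempt that did not improve.
    lessons: List[str] = field(default_factory=list)


def optimize(
    propose: Callable[[float, List[str]], float],
    evaluate: Callable[[float], float],
    initial: float,
    iterations: int,
) -> Tuple[Attempt, List[Attempt], Memory]:
    """Run a fixed number of propose/evaluate steps and return the best attempt."""
    best = Attempt("attempt-0", None, initial, evaluate(initial))
    history = [best]
    memory = Memory()
    for i in range(1, iterations + 1):
        # Each candidate is derived from the current best and may use past lessons.
        candidate = propose(best.artifact, memory.lessons)
        attempt = Attempt(f"attempt-{i}", best.branch, candidate, evaluate(candidate))
        history.append(attempt)
        if attempt.score > best.score:
            best = attempt  # progress is defined only by the evaluator outcome
        else:
            memory.lessons.append(
                f"{attempt.branch}: {candidate!r} did not improve on {best.branch}"
            )
    return best, history, memory


def toy_evaluate(x: float) -> float:
    # Assumed toy objective with its maximum at x = 3.
    return -(x - 3.0) ** 2


def toy_propose(current: float, lessons: List[str]) -> float:
    # Assumed toy policy: halve the step size for every recorded lesson.
    return current + 1.0 / (2 ** len(lessons))


if __name__ == "__main__":
    best, history, memory = optimize(toy_propose, toy_evaluate, 0.0, 6)
    print(best.branch, best.parent, best.artifact)
    print(len(history), "attempts,", len(memory.lessons), "lessons")
```

In this toy run every attempt stays in `history` with its parent label, so an unsuccessful attempt is never lost, and the lessons change what the proposer does next. A real system would replace the numeric artifact with code on an actual git branch and the lambda-style evaluator with a benchmark harness.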
