

SEVerA: Verified Synthesis of Self-Evolving Agents

March 26, 2026
Authors: Debangshu Banerjee, Changming Xu, Gagandeep Singh
cs.AI

Abstract

Recent advances have shown the effectiveness of self-evolving LLM agents on tasks such as program repair and scientific discovery. In this paradigm, a planner LLM synthesizes an agent program that invokes parametric models, including LLMs, which are then tuned per task to improve performance. However, existing self-evolving agent frameworks provide no formal guarantees of safety or correctness. Because such programs are often executed autonomously on unseen inputs, this lack of guarantees raises reliability and security concerns. We formulate agentic code generation as a constrained learning problem, combining hard formal specifications with soft objectives capturing task utility. We introduce Formally Guarded Generative Models (FGGM), which allow the planner LLM to specify a formal output contract for each generative model call using first-order logic. Each FGGM call wraps the underlying model in a rejection sampler with a verified fallback, ensuring every returned output satisfies the contract for any input and parameter setting. Building on FGGM, we present SEVerA (Self-Evolving Verified Agents), a three-stage framework: Search synthesizes candidate parametric programs containing FGGM calls; Verification proves correctness with respect to hard constraints for all parameter values, reducing the problem to unconstrained learning; and Learning applies scalable gradient-based optimization, including GRPO-style fine-tuning, to improve the soft objective while preserving correctness. We evaluate SEVerA on Dafny program verification, symbolic math synthesis, and policy-compliant agentic tool use (τ^2-bench). Across tasks, SEVerA achieves zero constraint violations while improving performance over unconstrained and SOTA baselines, showing that formal behavioral constraints not only guarantee correctness but also steer synthesis toward higher-quality agents.
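The FGGM mechanism described in the abstract — wrapping each generative model call in a rejection sampler that checks a formal output contract and falls back to a verified default — can be sketched as follows. This is a minimal illustration under assumed interfaces; the names (`fggm_call`, `contract`, `fallback`) are hypothetical and do not reflect the paper's actual API, and a real FGGM would check contracts expressed in first-order logic rather than arbitrary Python predicates.

```python
# Hypothetical sketch of a Formally Guarded Generative Model (FGGM) call:
# sample from the underlying model, check each candidate against an output
# contract, and return a verified fallback if every sample is rejected.
# This guarantees the returned value satisfies the contract for any input.
from typing import Callable, TypeVar

T = TypeVar("T")

def fggm_call(
    model: Callable[[str], T],      # underlying generative model (e.g. an LLM)
    contract: Callable[[T], bool],  # output contract, checked on each candidate
    fallback: T,                    # default output known to satisfy the contract
    prompt: str,
    max_samples: int = 8,
) -> T:
    """Rejection-sample until the contract holds, else return the verified fallback."""
    for _ in range(max_samples):
        candidate = model(prompt)
        if contract(candidate):
            return candidate
    # The fallback must itself be verified against the contract ahead of time.
    assert contract(fallback), "fallback must satisfy the contract"
    return fallback

# Toy example: a "model" emitting -3, -1, 4 under a non-negativity contract.
samples = iter([-3, -1, 4])
result = fggm_call(lambda _: next(samples), lambda x: x >= 0, fallback=0, prompt="")
# result is 4: the first sampled candidate that satisfies the contract
```

Because the wrapper's return value is contract-satisfying by construction (either a checked sample or the verified fallback), verification of the surrounding agent program can treat each FGGM call as a black box, which is what lets SEVerA reduce the constrained problem to unconstrained learning over the soft objective.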