ChatPaper.aiChatPaper

Lean 重構:通過智能體策略搜索的多目標可控證明優化

Lean Refactor: Multi-Objective Controllable Proof Optimization via Agentic Strategy Search

May 18, 2026
作者: Jialin Lu, Soonho Kong, Rodrigo Stehling, Kaiyu Yang, Zhangyang Wang, Weiran Sun, Wuyang Chen
cs.AI

摘要

我們提出 Lean Refactor,這是一個即插即用的檢索增強代理框架,專為 Lean 證明進行多目標、可控且具版本穩健性的重構。LLM 生成的證明以「正確但冗長」且在各函式庫版本間脆弱著稱,然而現有的重構工作忽略了三個實務挑戰:1) Lean 的重構本質上具有多目標性(證明長度、編譯成本與版本相容性經常相互衝突);2) Lean 儲存庫的相容性脆弱,而 LLM 的發布卻不考慮 Lean/Mathlib 版本;3) 基於訓練的流水線在每次新的 LLM 發布時都需要反覆微調,既無法隨模型更迭規模化,也無法跟上 Lean 的發布週期。Lean Refactor 透過從精心設計的多目標重構策略資料庫中進行檢索,來引導一個凍結的代理 LLM,該資料庫中的每條策略都附帶豐富的元數據,例如支援的 Lean/Mathlib 版本與預期的編譯成本降低。實驗結果顯示,在競賽基準測試上實現超過 70% 的詞元級壓縮,在研究型儲存庫上超過 20%,同時編譯時間最多減少 60%,優於先前的工作與 Claude Code。基於版本的過濾檢索進一步改善了目標 Lean 版本上的壓縮效果,且經過重構的 miniF2F 證明在向未來 Lean 版本進行零樣本版本遷移時,表現優於未經重構的證明。
English
We present Lean Refactor, a plug-and-play retrieval-augmented agentic framework for multi-objective, controllable, and version-robust refactoring of Lean proofs. LLM-generated proofs are notoriously correct-but-verbose and brittle across library versions, yet existing refactoring works overlook three practical challenges: 1) Lean refactoring is natively multi-objective (proof length, compilation cost, and version compatibility are often in tension); 2) Lean repositories have fragile compatibility, whereas LLM releases are unaware of Lean/Mathlib versions; 3) Training-based pipelines require repeated fine-tuning with each new LLM release, scaling neither with model churn nor with Lean's release cycle. Lean Refactor steers a frozen agentic LLM with retrievals from a curated database of multi-objective refactoring strategies, each densely annotated with metadata such as supported Lean/Mathlib versions and expected compilation-cost reduction. Experiments show over 70% token-level compression on competition benchmarks, over 20% on research repositories, and up to 60% compilation-time reduction, outperforming prior work and Claude Code. Version-filtered retrieval further improves compression on the target Lean version, and refactored miniF2F proofs exhibit stronger zero-shot version transfer to future Lean releases than their unrefactored counterparts.