MPS-Prover:通過多視角搜索與數據策展推進逐步定理證明
MPS-Prover: Advancing Stepwise Theorem Proving by Multi-Perspective Search and Data Curation
May 16, 2025
作者: Zhenwen Liang, Linfeng Song, Yang Li, Tao Yang, Feng Zhang, Haitao Mi, Dong Yu
cs.AI
摘要
在形式語言中的自動定理證明(Automated Theorem Proving, ATP)仍然是人工智慧領域中的一項艱鉅挑戰,它要求嚴密的邏輯推理並需在龐大的搜索空間中導航。儘管大型語言模型(LLMs)已展現出令人鼓舞的表現,現有的逐步證明器常因搜索指導的偏差而效率低下,導致證明策略次優。本文介紹了多視角搜索證明器(Multi-Perspective Search Prover, MPS-Prover),這是一種新穎的逐步ATP系統,旨在克服這些限制。MPS-Prover融合了兩項關鍵創新:一種高效的訓練後數據策展策略,能夠在不犧牲性能的情況下修剪約40%的冗餘訓練數據;以及一種多視角樹搜索機制。該搜索機制將學習到的評判模型與策略性設計的啟發式規則相結合,以多樣化戰術選擇,避免陷入無效狀態,並增強搜索的魯棒性。廣泛的評估表明,MPS-Prover在多個具有挑戰性的基準測試中,包括miniF2F和ProofNet,達到了最先進的性能,超越了之前擁有70億參數的模型。此外,我們的分析揭示,與現有的逐步和全證明方法相比,MPS-Prover生成的證明顯著更短且更多樣,突顯了其效率和有效性。我們的工作提升了基於LLM的形式推理能力,並為開發更強大的定理證明器提供了一個堅實的框架和全面的分析。
English
Automated Theorem Proving (ATP) in formal languages remains a formidable
challenge in AI, demanding rigorous logical deduction and navigating vast
search spaces. While large language models (LLMs) have shown promising
performance, existing stepwise provers often suffer from biased search
guidance, leading to inefficiencies and suboptimal proof strategies. This paper
introduces the Multi-Perspective Search Prover (MPS-Prover), a novel stepwise
ATP system designed to overcome these limitations. MPS-Prover incorporates two
key innovations: a highly effective post-training data curation strategy that
prunes approximately 40% of redundant training data without sacrificing
performance, and a multi-perspective tree search mechanism. This search
integrates a learned critic model with strategically designed heuristic rules
to diversify tactic selection, prevent getting trapped in unproductive states,
and enhance search robustness. Extensive evaluations demonstrate that
MPS-Prover achieves state-of-the-art performance on multiple challenging
benchmarks, including miniF2F and ProofNet, outperforming prior 7B parameter
models. Furthermore, our analyses reveal that MPS-Prover generates
significantly shorter and more diverse proofs compared to existing stepwise and
whole-proof methods, highlighting its efficiency and efficacy. Our work
advances the capabilities of LLM-based formal reasoning and offers a robust
framework and a comprehensive analysis for developing more powerful theorem
provers.Summary
AI-Generated Summary