ChatPaper.aiChatPaper

MPS-Prover:通过多视角搜索与数据精炼推进逐步定理证明

MPS-Prover: Advancing Stepwise Theorem Proving by Multi-Perspective Search and Data Curation

May 16, 2025
作者: Zhenwen Liang, Linfeng Song, Yang Li, Tao Yang, Feng Zhang, Haitao Mi, Dong Yu
cs.AI

摘要

在形式语言中的自动定理证明(ATP)仍然是人工智能领域的一项艰巨挑战,它要求严格的逻辑推理并需在庞大的搜索空间中导航。尽管大型语言模型(LLMs)已展现出令人鼓舞的表现,现有的逐步证明器常因搜索引导的偏差而效率低下,导致证明策略不够优化。本文介绍了多视角搜索证明器(MPS-Prover),一种新颖的逐步ATP系统,旨在克服这些局限。MPS-Prover融合了两项关键创新:一是高效的训练后数据筛选策略,能在不牺牲性能的前提下削减约40%的冗余训练数据;二是多视角树搜索机制。该机制将学习到的评判模型与精心设计的启发式规则相结合,以多样化策略选择,避免陷入无效状态,并增强搜索的鲁棒性。广泛的评估表明,MPS-Prover在包括miniF2F和ProofNet在内的多个挑战性基准测试中达到了最先进的性能,超越了之前拥有70亿参数的模型。此外,我们的分析揭示,与现有的逐步及全证明方法相比,MPS-Prover生成的证明显著更短且更具多样性,凸显了其高效性与有效性。我们的研究推进了基于LLM的形式推理能力,并为开发更强大的定理证明器提供了一个稳健的框架和全面的分析。
English
Automated Theorem Proving (ATP) in formal languages remains a formidable challenge in AI, demanding rigorous logical deduction and navigating vast search spaces. While large language models (LLMs) have shown promising performance, existing stepwise provers often suffer from biased search guidance, leading to inefficiencies and suboptimal proof strategies. This paper introduces the Multi-Perspective Search Prover (MPS-Prover), a novel stepwise ATP system designed to overcome these limitations. MPS-Prover incorporates two key innovations: a highly effective post-training data curation strategy that prunes approximately 40% of redundant training data without sacrificing performance, and a multi-perspective tree search mechanism. This search integrates a learned critic model with strategically designed heuristic rules to diversify tactic selection, prevent getting trapped in unproductive states, and enhance search robustness. Extensive evaluations demonstrate that MPS-Prover achieves state-of-the-art performance on multiple challenging benchmarks, including miniF2F and ProofNet, outperforming prior 7B parameter models. Furthermore, our analyses reveal that MPS-Prover generates significantly shorter and more diverse proofs compared to existing stepwise and whole-proof methods, highlighting its efficiency and efficacy. Our work advances the capabilities of LLM-based formal reasoning and offers a robust framework and a comprehensive analysis for developing more powerful theorem provers.

Summary

AI-Generated Summary

PDF72May 19, 2025