ChatPaper.aiChatPaper

金牌级奥赛几何解题与高效启发式辅助构造

Gold-Medal-Level Olympiad Geometry Solving with Efficient Heuristic Auxiliary Constructions

November 27, 2025
作者: Boyan Duan, Xiao Liang, Shuai Lu, Yaoxiang Wang, Yelong Shen, Kai-Wei Chang, Ying Nian Wu, Mao Yang, Weizhu Chen, Yeyun Gong
cs.AI

摘要

欧几里得几何的自动定理证明,特别是针对国际数学奥林匹克(IMO)难度级别的问题,仍是人工智能领域的重大挑战与重要研究方向。本文提出一种完全在CPU上运行、不依赖神经网络推理的高效几何定理证明方法。初步研究表明,简单的随机辅助点添加策略即可在IMO问题上达到银牌级别的人类表现。基于此,我们进一步提出HAGeo——一种基于启发式规则的几何演绎辅助构造方法,该方法在IMO-30基准测试中成功解决30道题中的28道,达到金牌级别表现,并以明显优势超越基于神经网络的竞争性方法AlphaGeometry。为更全面评估现有方法,我们构建了包含409道人工标注难度几何题的HAGeo-409基准测试集。与广泛使用的IMO-30相比,新基准提出了更大挑战,能提供更精确的评估,为几何定理证明设立了更高标准。
English
Automated theorem proving in Euclidean geometry, particularly for International Mathematical Olympiad (IMO) level problems, remains a major challenge and an important research focus in Artificial Intelligence. In this paper, we present a highly efficient method for geometry theorem proving that runs entirely on CPUs without relying on neural network-based inference. Our initial study shows that a simple random strategy for adding auxiliary points can achieve silver-medal level human performance on IMO. Building on this, we propose HAGeo, a Heuristic-based method for adding Auxiliary constructions in Geometric deduction that solves 28 of 30 problems on the IMO-30 benchmark, achieving gold-medal level performance and surpassing AlphaGeometry, a competitive neural network-based approach, by a notable margin. To evaluate our method and existing approaches more comprehensively, we further construct HAGeo-409, a benchmark consisting of 409 geometry problems with human-assessed difficulty levels. Compared with the widely used IMO-30, our benchmark poses greater challenges and provides a more precise evaluation, setting a higher bar for geometry theorem proving.
PDF11December 4, 2025