Gold-medalist Performance in Solving Olympiad Geometry with AlphaGeometry2
February 5, 2025
Authors: Yuri Chervonyi, Trieu H. Trinh, Miroslav Olšák, Xiaomeng Yang, Hoang Nguyen, Marcelo Menegali, Junehyuk Jung, Vikas Verma, Quoc V. Le, Thang Luong
cs.AI
Abstract
We present AlphaGeometry2, a significantly improved version of AlphaGeometry
introduced in Trinh et al. (2024), which has now surpassed an average gold
medalist in solving Olympiad geometry problems. To achieve this, we first
extend the original AlphaGeometry language to tackle harder problems involving
movements of objects, and problems containing linear equations of angles,
ratios, and distances. This, together with other additions, has markedly
improved the coverage rate of the AlphaGeometry language on International
Mathematical Olympiad (IMO) 2000-2024 geometry problems from 66% to 88%. The search
process of AlphaGeometry2 has also been greatly improved through the use of the Gemini
architecture for better language modeling and a novel knowledge-sharing
mechanism that combines multiple search trees. Together with further
enhancements to the symbolic engine and synthetic data generation, we have
significantly boosted the overall solving rate of AlphaGeometry2 to 84% for
all geometry problems over the last 25 years, compared to 54%
previously. AlphaGeometry2 was also part of the system that achieved the
silver-medal standard at IMO 2024 (https://dpmd.ai/imo-silver). Last but not
least, we report progress towards using AlphaGeometry2 as a part of a fully
automated system that reliably solves geometry problems directly from natural
language input.