AlphaApollo: Orchestrating Foundation Models and Professional Tools into a Self-Evolving System for Deep Agentic Reasoning

Abstract

We present AlphaApollo, a self-evolving agentic reasoning system that aims to address two bottlenecks in foundation model (FM) reasoning: limited model-intrinsic capacity and unreliable test-time iteration. AlphaApollo orchestrates multiple models with professional tools to enable deliberate, verifiable reasoning. It couples (i) a computation tool (Python with numerical and symbolic libraries) and (ii) a retrieval tool (task-relevant external information) to execute exact calculations and ground decisions. The system further supports multi-round, multi-model solution evolution via a shared state map that records candidates, executable checks, and feedback for iterative refinement. In evaluations on AIME 2024/2025 across multiple models, AlphaApollo delivers consistent gains: +5.15% Average@32 and +23.34% Pass@32 for Qwen2.5-14B-Instruct, and +8.91% Average@32 and +26.67% Pass@32 for Llama-3.3-70B-Instruct. Tool-use analysis shows that more than 80% of tool calls execute successfully and that tool-augmented runs consistently outperform non-tool baselines, thereby lifting the capability ceiling of FMs. Further empirical results and implementation details will be released at https://github.com/tmlr-group/AlphaApollo.
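The two reported metrics can be illustrated with a minimal sketch. It assumes the common definitions, which the abstract itself does not spell out: Average@k is the per-sample accuracy averaged over k sampled solutions per problem and then over problems, and Pass@k is the fraction of problems solved by at least one of the k samples. The function names and the toy data below are hypothetical and are not taken from the AlphaApollo repository.

```python
from typing import Sequence


def average_at_k(results: Sequence[Sequence[bool]]) -> float:
    """Assumed definition: mean accuracy over each problem's k samples,
    then averaged over all problems."""
    per_problem = [sum(samples) / len(samples) for samples in results]
    return sum(per_problem) / len(per_problem)


def pass_at_k(results: Sequence[Sequence[bool]]) -> float:
    """Assumed definition: fraction of problems where at least one
    of the k samples is correct."""
    return sum(any(samples) for samples in results) / len(results)


# Toy data (hypothetical): 3 problems, 4 samples each; the abstract uses k = 32.
toy = [
    [True, False, False, False],   # solved by one sample out of four
    [False, False, False, False],  # never solved
    [True, True, True, True],      # always solved
]

print(f"Average@4 = {average_at_k(toy):.4f}")
print(f"Pass@4    = {pass_at_k(toy):.4f}")
```

Under these assumed definitions, Pass@k can never be lower than Average@k on the same samples, since it credits a problem as soon as any single sample succeeds.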
