AM-Thinking-v1: Advancing the Frontier of Reasoning at 32B Scale
May 13, 2025
Authors: Yunjie Ji, Xiaoyu Tian, Sitong Zhao, Haotian Wang, Shuaiting Chen, Yiping Peng, Han Zhao, Xiangang Li
cs.AI
Abstract
We present AM-Thinking-v1, a 32B dense language model that advances the frontier of reasoning, embodying the collaborative spirit of open-source innovation. Outperforming DeepSeek-R1 and rivaling leading Mixture-of-Experts (MoE) models such as Qwen3-235B-A22B and Seed1.5-Thinking, AM-Thinking-v1 scores 85.3 on AIME 2024, 74.4 on AIME 2025, and 70.3 on LiveCodeBench, demonstrating state-of-the-art mathematical and coding capabilities among open-source models of similar scale.
Built entirely from the open-source Qwen2.5-32B base model and publicly available queries, AM-Thinking-v1 leverages a meticulously crafted post-training pipeline, combining supervised fine-tuning and reinforcement learning, to deliver exceptional reasoning capabilities. This work demonstrates that the open-source community can achieve high performance at the 32B scale, a practical sweet spot for deployment and fine-tuning. By striking a balance between top-tier performance and real-world usability, we hope AM-Thinking-v1 inspires further collaborative efforts to harness mid-scale models, pushing reasoning boundaries while keeping accessibility at the core of innovation. We have open-sourced our model on Hugging Face at https://huggingface.co/a-m-team/AM-Thinking-v1.
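
Because the weights are hosted on the Hugging Face Hub, the model can be tried directly with the transformers library. The following is a minimal sketch, assuming the standard chat-template and text-generation APIs; the prompt and generation settings are illustrative only, not the authors' recommended configuration.

```python
# Minimal sketch: loading AM-Thinking-v1 from the Hugging Face Hub with the
# standard transformers text-generation API. Generation settings are
# illustrative, not the authors' recommended configuration.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "a-m-team/AM-Thinking-v1"  # repository from the paper's link

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",   # use the checkpoint's native precision
    device_map="auto",    # requires `accelerate`; places layers on available GPUs
)

# An example math-style query; the model is trained for step-by-step reasoning.
messages = [{"role": "user", "content": "How many positive integers below 100 are divisible by 3 or 5?"}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=1024)
# Strip the prompt tokens and print only the newly generated completion.
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```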