AM-Thinking-v1: Advancing the Frontier of Reasoning at 32B Scale

Abstract

We present AM-Thinking-v1, a 32B dense language model that advances the frontier of reasoning and embodies the collaborative spirit of open-source innovation. Outperforming DeepSeek-R1 and rivaling leading Mixture-of-Experts (MoE) models such as Qwen3-235B-A22B and Seed1.5-Thinking, AM-Thinking-v1 scores 85.3 on AIME 2024, 74.4 on AIME 2025, and 70.3 on LiveCodeBench, demonstrating state-of-the-art mathematical and coding capabilities among open-source models of similar scale. Built entirely from the open-source Qwen2.5-32B base model and publicly available queries, AM-Thinking-v1 relies on a carefully designed post-training pipeline that combines supervised fine-tuning and reinforcement learning to deliver its reasoning capabilities. This work demonstrates that the open-source community can achieve high performance at the 32B scale, a practical sweet spot for deployment and fine-tuning. By striking a balance between top-tier performance and real-world usability, we hope AM-Thinking-v1 inspires further collaborative efforts to harness mid-scale models, pushing reasoning boundaries while keeping accessibility at the core of innovation. We have open-sourced our model on Hugging Face at https://huggingface.co/a-m-team/AM-Thinking-v1.