Apriel-1.5-15b-Thinker

October 1, 2025
Authors: Shruthan Radhakrishna, Aman Tiwari, Aanjaneya Shukla, Masoud Hashemi, Rishabh Maheshwary, Shiva Krishna Reddy Malay, Jash Mehta, Pulkit Pattnaik, Saloni Mittal, Khalil Slimi, Kelechi Ogueji, Akintunde Oladipo, Soham Parikh, Oluwanifemi Bamgbose, Toby Liang, Ahmed Masry, Khyati Mahajan, Sai Rajeswar Mudumba, Vikas Yadav, Sathwik Tejaswi Madhusudhan, Torsten Scholak, Sagar Davasam, Srinivas Sunkara, Nicholas Chapados
cs.AI

Abstract

We present Apriel-1.5-15B-Thinker, a 15-billion-parameter open-weights multimodal reasoning model that achieves frontier-level performance through training design rather than sheer scale. Starting from Pixtral-12B, we apply a progressive three-stage methodology: (1) depth upscaling to expand reasoning capacity without pretraining from scratch, (2) staged continual pre-training that first develops foundational text and vision understanding, then enhances visual reasoning through targeted synthetic data generation addressing spatial structure, compositional understanding, and fine-grained perception, and (3) high-quality text-only supervised fine-tuning on curated instruction-response pairs with explicit reasoning traces spanning mathematics, coding, science, and tool use. Notably, our model achieves competitive results without reinforcement learning or preference optimization, isolating the contribution of our data-centric continual pre-training approach. On the Artificial Analysis Intelligence Index, Apriel-1.5-15B-Thinker attains a score of 52, matching DeepSeek-R1-0528 despite requiring significantly fewer computational resources. Across ten image benchmarks, its performance is on average within five points of Gemini-2.5-Flash and Claude Sonnet-3.7, a key achievement for a model operating within single-GPU deployment constraints. Our results demonstrate that thoughtful mid-training design can close substantial capability gaps without massive scale, making frontier-level multimodal reasoning accessible to organizations with limited infrastructure. We release the model checkpoint, all training recipes, and evaluation protocols under the MIT license to advance open-source research.
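The depth-upscaling step in stage (1) grows the decoder stack while reusing trained weights, avoiding pretraining the added depth from scratch. The abstract does not specify the authors' exact recipe, so the sketch below shows one common realization of the idea: duplicating trained middle layers of a LLaMA-style stack. The `depth_upscale` helper, the layer indices, and the tiny stand-in config are illustrative assumptions, not the paper's implementation.

```python
import copy
import torch.nn as nn
from transformers import LlamaConfig, LlamaForCausalLM

def depth_upscale(model, duplicate_indices):
    """Deepen a decoder stack by inserting trained-weight copies of
    selected layers, so the new depth starts from learned parameters
    rather than random initialization."""
    upscaled = []
    for i, layer in enumerate(model.model.layers):
        upscaled.append(layer)
        if i in duplicate_indices:
            # The copy inherits trained weights; it is then refined
            # during continual pre-training. Real use would also refresh
            # per-layer index bookkeeping used by KV caching.
            upscaled.append(copy.deepcopy(layer))
    model.model.layers = nn.ModuleList(upscaled)
    model.config.num_hidden_layers = len(upscaled)
    return model

# Tiny stand-in config so the sketch runs without downloading weights;
# Pixtral-12B's actual language tower is far larger.
cfg = LlamaConfig(hidden_size=64, intermediate_size=128,
                  num_hidden_layers=8, num_attention_heads=4,
                  num_key_value_heads=4, vocab_size=1000)
model = LlamaForCausalLM(cfg)
model = depth_upscale(model, duplicate_indices=set(range(2, 6)))
print(model.config.num_hidden_layers)  # 12: four middle layers duplicated
```

Duplicating contiguous middle layers is only one choice; which layers are copied, and how the upscaled model is subsequently trained, are exactly the design decisions the paper's staged recipe addresses.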