Apriel-1.5-15b-Thinker

October 1, 2025
Authors: Shruthan Radhakrishna, Aman Tiwari, Aanjaneya Shukla, Masoud Hashemi, Rishabh Maheshwary, Shiva Krishna Reddy Malay, Jash Mehta, Pulkit Pattnaik, Saloni Mittal, Khalil Slimi, Kelechi Ogueji, Akintunde Oladipo, Soham Parikh, Oluwanifemi Bamgbose, Toby Liang, Ahmed Masry, Khyati Mahajan, Sai Rajeswar Mudumba, Vikas Yadav, Sathwik Tejaswi Madhusudhan, Torsten Scholak, Sagar Davasam, Srinivas Sunkara, Nicholas Chapados
cs.AI

Abstract

We present Apriel-1.5-15B-Thinker, a 15-billion parameter open-weights multimodal reasoning model that achieves frontier-level performance through training design rather than sheer scale. Starting from Pixtral-12B, we apply a progressive three-stage methodology: (1) depth upscaling to expand reasoning capacity without pretraining from scratch, (2) staged continual pre-training that first develops foundational text and vision understanding, then enhances visual reasoning through targeted synthetic data generation addressing spatial structure, compositional understanding, and fine-grained perception, and (3) high-quality text-only supervised fine-tuning on curated instruction-response pairs with explicit reasoning traces spanning mathematics, coding, science, and tool use. Notably, our model achieves competitive results without reinforcement learning or preference optimization, isolating the contribution of our data-centric continual pre-training approach. On the Artificial Analysis Intelligence Index, Apriel-1.5-15B-Thinker attains a score of 52, matching DeepSeek-R1-0528 despite requiring significantly fewer computational resources. Across ten image benchmarks, its performance is on average within five points of Gemini-2.5-Flash and Claude Sonnet-3.7, a key achievement for a model operating within single-GPU deployment constraints. Our results demonstrate that thoughtful mid-training design can close substantial capability gaps without massive scale, making frontier-level multimodal reasoning accessible to organizations with limited infrastructure. We release the model checkpoint, all training recipes, and evaluation protocols under the MIT license to advance open-source research.
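
Note on the depth-upscaling stage: the abstract only states that depth upscaling expands reasoning capacity without pretraining from scratch. A common way to realize this is to duplicate existing transformer blocks and then continue pre-training on the deeper stack; the minimal sketch below illustrates that idea. The function name `depth_upscale`, the even-spacing heuristic, and the example layer counts are assumptions for illustration, not the paper's actual recipe.

```python
# Minimal sketch of depth upscaling by layer duplication (assumption: the
# exact layer-selection recipe used for Apriel-1.5-15B-Thinker is not given
# in the abstract; even spacing and identity copies are illustrative choices).
import copy
import torch.nn as nn

def depth_upscale(layers: nn.ModuleList, extra: int) -> nn.ModuleList:
    """Grow a decoder stack by duplicating `extra` evenly spaced blocks.

    Each duplicate starts as an exact copy of its source block, so the
    upscaled model stays close to the original network and can be adapted
    with continual pre-training rather than trained from scratch.
    """
    n = len(layers)
    assert 0 < extra <= n, "sketch assumes at most doubling the depth"

    # Indices of the blocks to duplicate, spread evenly over the stack.
    dup_idx = {round(i * (n - 1) / max(extra - 1, 1)) for i in range(extra)}

    new_layers = []
    for idx, layer in enumerate(layers):
        new_layers.append(layer)
        if idx in dup_idx:
            # deepcopy gives the duplicate its own trainable parameters.
            new_layers.append(copy.deepcopy(layer))
    return nn.ModuleList(new_layers)

# Hypothetical usage: expand a 40-block decoder to 48 blocks before the
# continual pre-training stage, e.g.
#   model.model.layers = depth_upscale(model.model.layers, extra=8)
```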