C-RADIOv4 (Tech Report)
January 24, 2026
Authors: Mike Ranzinger, Greg Heinrich, Collin McCarthy, Jan Kautz, Andrew Tao, Bryan Catanzaro, Pavlo Molchanov
cs.AI
Abstract
By leveraging multi-teacher distillation, agglomerative vision backbones provide a unified student model that retains and improves on the distinct capabilities of multiple teachers. In this tech report, we describe the most recent release of the C-RADIO family of models, C-RADIOv4, which builds on the design of AM-RADIO/RADIOv2.5 and offers strong improvements on key downstream tasks at the same computational complexity. We release two model variants, -SO400M (412M parameters) and -H (631M parameters), both trained with an updated set of teachers: SigLIP2, DINOv3, and SAM3. In addition to improvements on core metrics and new capabilities from imitating SAM3, the C-RADIOv4 model family further improves any-resolution support, brings back the ViTDet option for drastically enhanced efficiency at high resolution, and comes with a permissive license.