
InfiR: Crafting Effective Small Language Models and Multimodal Small Language Models in Reasoning

February 17, 2025
作者: Congkai Xie, Shuo Cai, Wenjun Wang, Pengxiang Li, Zhijie Sang, Kejing Yang, Yiming Zhang, Zhen Li, Guanghao Zhu, Zeyu Liu, Yang Yu, Yuhang Liu, Su Lu, Baoyi He, Qi Zhou, Xiaotian Han, Jianbo Yuan, Shengyu Zhang, Fei Wu, Hongxia Yang
cs.AI

Abstract

Large Language Models (LLMs) and Multimodal Large Language Models (MLLMs) have made significant advancements in reasoning capabilities. However, they still face challenges such as high computational demands and privacy concerns. This paper focuses on developing efficient Small Language Models (SLMs) and Multimodal Small Language Models (MSLMs) that retain competitive reasoning abilities. We introduce a novel training pipeline that enhances reasoning capabilities and facilitates deployment on edge devices, achieving state-of-the-art performance while minimizing development costs. InfiR aims to advance AI systems by improving reasoning, reducing adoption barriers, and addressing privacy concerns through smaller model sizes. Resources are available at https://github.com/Reallm-Labs/InfiR.
