MobiLlama: Towards Accurate and Lightweight Fully Transparent GPT
February 26, 2024
Authors: Omkar Thawakar, Ashmal Vayani, Salman Khan, Hisham Cholakal, Rao M. Anwer, Michael Felsberg, Tim Baldwin, Eric P. Xing, Fahad Shahbaz Khan
cs.AI
Abstract
"Bigger the better" has been the predominant trend in recent Large Language
Models (LLMs) development. However, LLMs are not well suited to scenarios that
require on-device processing, energy efficiency, low memory footprint, and
response efficiency. These requisites are crucial for privacy, security, and
sustainable deployment. This paper explores the "less is more" paradigm by
addressing the challenge of designing accurate yet efficient Small Language
Models (SLMs) for resource-constrained devices. Our primary contribution is the
introduction of an accurate and fully transparent open-source 0.5 billion
(0.5B) parameter SLM, named MobiLlama, catering to the specific needs of
resource-constrained computing with an emphasis on enhanced performance with
reduced resource demands. MobiLlama is an SLM design that starts from a
larger model and applies a careful parameter sharing scheme to reduce both the
pre-training and the deployment cost. Our work strives not only to bridge the
gap in open-source SLMs but also to ensure full transparency: the complete
training data pipeline, training code, model weights, and over 300 checkpoints,
along with evaluation code, are available at
https://github.com/mbzuai-oryx/MobiLlama.
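
To make the parameter-sharing idea concrete, the following is a minimal sketch of how sharing one block across layers shrinks a model's parameter count. The abstract does not say which parameters are shared, so this sketch assumes the feed-forward block is the shared component. The layer count, widths, and function names are hypothetical illustration values, not MobiLlama's actual configuration.

```python
# Illustrative parameter counting for a stack of transformer blocks.
# Assumption: the shared component is the feed-forward (FFN) block.
# All sizes below are hypothetical and chosen only for the arithmetic.

def attention_params(d_model: int) -> int:
    """Q, K, V and output projections, biases omitted."""
    return 4 * d_model * d_model


def ffn_params(d_model: int, d_ff: int) -> int:
    """Gated feed-forward block: gate, up and down projections, biases omitted."""
    return 3 * d_model * d_ff


def block_stack_params(n_layers: int, d_model: int, d_ff: int, share_ffn: bool) -> int:
    """Total parameters in the block stack (embeddings and norms excluded)."""
    attn_total = n_layers * attention_params(d_model)
    # With sharing, a single FFN is reused by every layer.
    ffn_copies = 1 if share_ffn else n_layers
    return attn_total + ffn_copies * ffn_params(d_model, d_ff)


# Hypothetical configuration, not taken from the paper.
N_LAYERS, D_MODEL, D_FF = 8, 512, 2048

baseline = block_stack_params(N_LAYERS, D_MODEL, D_FF, share_ffn=False)
shared = block_stack_params(N_LAYERS, D_MODEL, D_FF, share_ffn=True)

print(f"per-layer FFN: {baseline:,} parameters")
print(f"shared FFN:    {shared:,} parameters")
print(f"reduction:     {1 - shared / baseline:.1%}")
```

Under these assumptions the saving equals the size of one feed-forward block multiplied by the number of layers minus one, so it grows with model depth while the attention parameters stay unique to each layer.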