MobiLlama: Towards Accurate and Lightweight Fully Transparent GPT

Abstract

"Bigger is better" has been the predominant trend in recent Large Language Model (LLM) development. However, LLMs are poorly suited to scenarios that require on-device processing, energy efficiency, a low memory footprint, and response efficiency. These requirements are crucial for privacy, security, and sustainable deployment. This paper explores the "less is more" paradigm by addressing the challenge of designing accurate yet efficient Small Language Models (SLMs) for resource-constrained devices. Our primary contribution is an accurate and fully transparent open-source 0.5 billion (0.5B) parameter SLM, named MobiLlama, that caters to the specific needs of resource-constrained computing, with an emphasis on enhanced performance at reduced resource demands. MobiLlama is an SLM design that starts from a larger model and applies a careful parameter-sharing scheme to reduce both pre-training and deployment costs. Our work strives not only to bridge the gap in open-source SLMs but also to ensure full transparency: the complete training data pipeline, training code, model weights, and over 300 checkpoints, along with evaluation code, are available at https://github.com/mbzuai-oryx/MobiLlama.

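The abstract does not say which parameters are shared, so the following is only a rough sketch of why parameter sharing shrinks a model. It assumes, purely for illustration, that the feed-forward sub-layer is the component shared across transformer blocks. The `Config` sizes and all function names are hypothetical and are not MobiLlama's actual configuration or code.

```python
from dataclasses import dataclass


@dataclass
class Config:
    # Hypothetical sizes chosen for illustration; not MobiLlama's real configuration.
    n_layers: int = 8
    d_model: int = 512
    d_ff: int = 2048


def attention_params(cfg: Config) -> int:
    # Four d_model x d_model projections (query, key, value, output), no biases.
    return 4 * cfg.d_model * cfg.d_model


def ffn_params(cfg: Config) -> int:
    # Two projections (d_model -> d_ff and d_ff -> d_model), no biases.
    return 2 * cfg.d_model * cfg.d_ff


def block_params(cfg: Config, share_ffn: bool) -> int:
    """Total parameters across all blocks (embeddings and norms excluded)."""
    attn_total = cfg.n_layers * attention_params(cfg)
    # Assumption: with sharing, a single feed-forward layer is reused by every block.
    ffn_copies = 1 if share_ffn else cfg.n_layers
    return attn_total + ffn_copies * ffn_params(cfg)


cfg = Config()
baseline = block_params(cfg, share_ffn=False)
shared = block_params(cfg, share_ffn=True)
print(f"unshared: {baseline:,} parameters")
print(f"shared:   {shared:,} parameters")
print(f"ratio:    {shared / baseline:.2f}")
```

Under these assumed sizes the feed-forward weights dominate each block, so reusing one copy removes most of the per-block cost while leaving the attention weights untouched.
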