
TinyLlama: An Open-Source Small Language Model

January 4, 2024
Authors: Peiyuan Zhang, Guangtao Zeng, Tianduo Wang, Wei Lu
cs.AI

Abstract

We present TinyLlama, a compact 1.1B language model pretrained on around 1 trillion tokens for approximately 3 epochs. Building on the architecture and tokenizer of Llama 2, TinyLlama leverages various advances contributed by the open-source community (e.g., FlashAttention), achieving better computational efficiency. Despite its relatively small size, TinyLlama demonstrates remarkable performance in a series of downstream tasks. It significantly outperforms existing open-source language models with comparable sizes. Our model checkpoints and code are publicly available on GitHub at https://github.com/jzhang38/TinyLlama.
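Because the checkpoints are public, a quick way to try the model is via the Hugging Face transformers library. The sketch below assumes the released weights are mirrored on the Hugging Face Hub under a repository id such as `TinyLlama/TinyLlama-1.1B-Chat-v1.0` (an assumption; the canonical list of checkpoints is on the project's GitHub page linked above).

```python
# A minimal sketch of loading and sampling from a TinyLlama checkpoint
# with Hugging Face transformers. The repo id below is assumed; swap in
# whichever checkpoint is listed on the project's GitHub page.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "TinyLlama/TinyLlama-1.1B-Chat-v1.0"  # assumed Hub repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # 1.1B params fit comfortably in fp16
)

# Generate a short continuation to sanity-check the checkpoint.
inputs = tokenizer("TinyLlama is", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Because TinyLlama reuses the Llama 2 architecture and tokenizer, it loads through the same `AutoModelForCausalLM` path as any Llama-family model, with no custom modeling code required.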