
TinyLlama: An Open-Source Small Language Model

January 4, 2024
Authors: Peiyuan Zhang, Guangtao Zeng, Tianduo Wang, Wei Lu
cs.AI

Abstract

We present TinyLlama, a compact 1.1B language model pretrained on around 1 trillion tokens for approximately 3 epochs. Building on the architecture and tokenizer of Llama 2, TinyLlama leverages various advances contributed by the open-source community (e.g., FlashAttention), achieving better computational efficiency. Despite its relatively small size, TinyLlama demonstrates remarkable performance on a range of downstream tasks, significantly outperforming existing open-source language models of comparable size. Our model checkpoints and code are publicly available on GitHub at https://github.com/jzhang38/TinyLlama.
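
Because TinyLlama reuses the Llama 2 architecture and tokenizer, the released checkpoints should load with standard tooling. Below is a minimal sketch using Hugging Face transformers; the Hub model ID is an assumption (consult the GitHub repository above for the checkpoints actually published).

# Minimal sketch of loading a TinyLlama checkpoint with Hugging Face
# transformers (pip install transformers torch). The model ID below is an
# assumption; see https://github.com/jzhang38/TinyLlama for the actual
# released checkpoints.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "TinyLlama/TinyLlama-1.1B-intermediate-step-1431k-3T"  # assumed Hub ID

# TinyLlama follows Llama 2's tokenizer and architecture, so the standard
# auto classes resolve it without custom model code.
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Run a short greedy generation to sanity-check the checkpoint.
inputs = tokenizer("TinyLlama is a compact language model that", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))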