GEB-1.3B: Open Lightweight Large Language Model
June 14, 2024
Authors: Jie Wu, Yufeng Zhu, Lei Shen, Xuqing Lu
cs.AI
Abstract
Recently developed large language models (LLMs) such as ChatGPT, Claude, and
Llama have demonstrated impressive abilities, and even surpass human-level
performance in several tasks. Despite their success, the resource-intensive
demands of these models, requiring significant computational power for both
training and inference, limit their deployment to high-performance servers.
Additionally, the extensive calculation requirements of the models often lead
to increased latency in response times. With the increasing need for LLMs to
operate efficiently on CPUs, research about lightweight models that are
optimized for CPU inference has emerged. In this work, we introduce GEB-1.3B, a
lightweight LLM trained on 550 billion tokens in both Chinese and English
languages. We employ novel training techniques, including RoPE,
Group-Query-Attention, and FlashAttention-2, to accelerate training while
maintaining model performance. Additionally, we fine-tune the model using 10
million samples of instruction data to enhance alignment. GEB-1.3B exhibits
outstanding performance on general benchmarks such as MMLU, C-Eval, and CMMLU,
outperforming comparative models such as MindLLM-1.3B and TinyLLaMA-1.1B.
Notably, the FP32 version of GEB-1.3B achieves commendable inference times on
CPUs, with ongoing efforts to further enhance speed through advanced
quantization techniques. The release of GEB-1.3B as an open-source model marks
a significant contribution to the development of lightweight LLMs, promising to
foster further research and innovation in the field.
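The abstract lists RoPE (rotary position embedding) among the training techniques. The paper's actual implementation is not shown here; as a minimal illustrative sketch, RoPE can be understood as rotating each pair of embedding channels by a position-dependent angle, so that dot products between rotated query/key vectors depend only on relative position:

```python
import numpy as np

def rope(x, base=10000.0):
    """Minimal rotary position embedding (RoPE) sketch for x of shape (seq_len, dim).

    Channel pairs (2i, 2i+1) at position p are rotated by angle
    p * base**(-2i/dim); the rotation makes query/key dot products
    a function of relative position only.
    """
    seq_len, dim = x.shape
    assert dim % 2 == 0, "RoPE requires an even embedding dimension"
    half = dim // 2
    # Per-pair inverse frequencies: base^(-2i/dim) for i in [0, half)
    inv_freq = base ** (-np.arange(half) * 2.0 / dim)
    # Rotation angle for each (position, channel-pair)
    theta = np.outer(np.arange(seq_len), inv_freq)   # (seq_len, half)
    cos, sin = np.cos(theta), np.sin(theta)
    x1, x2 = x[:, 0::2], x[:, 1::2]                  # even / odd channels
    out = np.empty_like(x)
    out[:, 0::2] = x1 * cos - x2 * sin               # 2-D rotation per pair
    out[:, 1::2] = x1 * sin + x2 * cos
    return out
```

Because position 0 gets a zero rotation, RoPE leaves the first token's vector unchanged, and the dot product between a query rotated at position m and a key rotated at position n depends only on n - m.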
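The abstract also names Group-Query-Attention (GQA), which shrinks the key/value cache during inference by letting a group of query heads share one key/value head. The sketch below is an assumption-laden NumPy illustration of the general mechanism, not GEB-1.3B's implementation (head counts and shapes are made up for the example):

```python
import numpy as np

def grouped_query_attention(q, k, v, n_kv_heads):
    """Minimal grouped-query attention (GQA) sketch.

    q: (n_q_heads, seq, d); k, v: (n_kv_heads, seq, d).
    Each group of n_q_heads // n_kv_heads query heads attends with a
    shared K/V head, reducing KV-cache memory at inference time.
    """
    n_q_heads, seq, d = q.shape
    assert n_q_heads % n_kv_heads == 0, "query heads must divide evenly into groups"
    group = n_q_heads // n_kv_heads
    out = np.empty_like(q)
    for h in range(n_q_heads):
        kh, vh = k[h // group], v[h // group]        # shared K/V head for this group
        scores = q[h] @ kh.T / np.sqrt(d)            # scaled dot-product, (seq, seq)
        w = np.exp(scores - scores.max(axis=-1, keepdims=True))
        w /= w.sum(axis=-1, keepdims=True)           # row-wise softmax
        out[h] = w @ vh
    return out
```

With n_kv_heads equal to the number of query heads this reduces to standard multi-head attention; with n_kv_heads = 1 it becomes multi-query attention, so GQA interpolates between the two.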