Understanding LLMs: A Comprehensive Overview from Training to Inference
January 4, 2024
Authors: Yiheng Liu, Hao He, Tianle Han, Xu Zhang, Mengyuan Liu, Jiaming Tian, Yutong Zhang, Jiaqi Wang, Xiaohui Gao, Tianyang Zhong, Yi Pan, Shaochen Xu, Zihao Wu, Zhengliang Liu, Xin Zhang, Shu Zhang, Xintao Hu, Tuo Zhang, Ning Qiang, Tianming Liu, Bao Ge
cs.AI
Abstract
The introduction of ChatGPT has led to a significant increase in the
utilization of Large Language Models (LLMs) for addressing downstream tasks.
In this context, there is a growing focus on cost-efficient training and
deployment; low-cost training and deployment of LLMs represent the future
development trend. This paper reviews the evolution of large language model
training techniques and inference deployment technologies in line with this
emerging trend. The discussion of training covers data preprocessing,
training architecture, pre-training tasks, parallel training, and model
fine-tuning. On the inference side, the paper covers model compression,
parallel computation, memory scheduling, and structural optimization. It also
explores the utilization of LLMs and offers insights into their future
development.