Understanding LLMs: A Comprehensive Overview from Training to Inference

Abstract

The introduction of ChatGPT has led to a significant increase in the use of Large Language Models (LLMs) for downstream tasks. In this context, attention is increasingly turning to cost-efficient training and deployment, and low-cost training and deployment of LLMs represents the future development trend. This paper reviews the evolution of LLM training techniques and inference deployment technologies in line with this emerging trend. The discussion of training covers data preprocessing, training architecture, pre-training tasks, parallel training, and model fine-tuning. On the inference side, the paper covers model compression, parallel computation, memory scheduling, and structural optimization. It also explores how LLMs are used and offers insights into their future development.

