I-SHEEP: Self-Alignment of LLM from Scratch through an Iterative Self-Enhancement Paradigm

Abstract

Large Language Models (LLMs) have advanced significantly, but the common learning paradigm treats them as passive information repositories and neglects their potential for active learning and alignment. Some approaches train LLMs on synthetic data that the models generate themselves, exploring the possibility of active alignment. However, a large gap remains between these one-time alignment methods and the continuous, automatic alignment seen in humans. In this paper, we introduce I-SHEEP, an Iterative Self-EnHancEmEnt Paradigm. This human-like paradigm enables LLMs to continuously self-align from scratch. Compared to the one-time alignment method Dromedary (Sun et al., 2023), which corresponds to the first iteration in this paper, I-SHEEP significantly enhances the capabilities of both Qwen and Llama models. Over subsequent iterations on the Qwen-1.5 72B model, I-SHEEP achieves a maximum relative improvement of 78.2% on Alpaca Eval and 24.0% on MT Bench, and an absolute increase of 8.88% in IFEval accuracy. Additionally, I-SHEEP surpasses the base model on various standard benchmark generation tasks, with an average improvement of 24.77% on code generation tasks, 12.04% on TriviaQA, and 20.29% on SQuAD. We also provide new insights based on the experimental results. Our code, datasets, and models are available at https://anonymous.4open.science/r/I-SHEEP.
