SemiEvol: Semi-supervised Fine-tuning for LLM Adaptation

Abstract

Supervised fine-tuning (SFT) is crucial for adapting large language models (LLMs) to a specific domain or task. In practical applications, however, only a limited amount of labeled data is available, which makes it hard for SFT to yield satisfactory results. A data-efficient framework that can exploit both labeled and unlabeled data for LLM fine-tuning is therefore highly desirable. To this end, we introduce SemiEvol, a semi-supervised fine-tuning framework that adapts LLMs in a propagate-and-select manner. For knowledge propagation, SemiEvol adopts a bi-level approach, propagating knowledge from labeled data to unlabeled data through both in-weight and in-context methods. For knowledge selection, SemiEvol incorporates a collaborative learning mechanism that selects higher-quality pseudo-response samples. We conducted experiments using GPT-4o-mini and Llama-3.1 on seven general or domain-specific datasets, demonstrating significant improvements in model performance on target data. Furthermore, we compared SemiEvol with SFT and self-evolution methods, highlighting its practicality in hybrid data scenarios.
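The propagate-and-select idea can be illustrated with a toy sketch. The abstract does not specify how propagation or selection is implemented, so everything below is an assumption made for illustration: in-context propagation is approximated by prepending the most similar labeled examples to the prompt, and collaborative selection is approximated by keeping a pseudo-response only when several models agree on it. The names `token_overlap`, `build_prompt`, `select_pseudo_labels`, `copy_nearest`, and `keyword_model` are hypothetical, and the two "models" are plain functions standing in for LLM calls. In-weight propagation (fine-tuning on the labeled data) is not implemented here.

```python
from collections import Counter
from typing import Callable, List, Tuple

Example = Tuple[str, str]     # (question, answer)
Model = Callable[[str], str]  # prompt -> answer; stand-in for an LLM call


def token_overlap(a: str, b: str) -> float:
    """Jaccard overlap of lowercase word sets (a toy retrieval score)."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    return len(sa & sb) / len(sa | sb) if sa | sb else 0.0


def build_prompt(question: str, labeled: List[Example], k: int = 2) -> str:
    """Assumed in-context propagation: prepend the k most similar labeled examples."""
    nearest = sorted(
        labeled, key=lambda ex: token_overlap(question, ex[0]), reverse=True
    )[:k]
    demos = "\n".join(f"Q: {q}\nA: {a}" for q, a in nearest)
    return f"{demos}\nQ: {question}\nA:"


def select_pseudo_labels(
    unlabeled: List[str],
    labeled: List[Example],
    models: List[Model],
    min_agreement: float = 1.0,
) -> List[Example]:
    """Assumed collaborative selection: keep answers that enough models agree on."""
    selected = []
    for question in unlabeled:
        prompt = build_prompt(question, labeled)
        answers = [model(prompt) for model in models]
        answer, votes = Counter(answers).most_common(1)[0]
        if votes / len(answers) >= min_agreement:
            selected.append((question, answer))
    return selected


def copy_nearest(prompt: str) -> str:
    """Toy model: repeat the answer of the first (most similar) demonstration."""
    for line in prompt.splitlines():
        if line.startswith("A: "):
            return line[3:]
    return ""


def keyword_model(prompt: str) -> str:
    """Toy model: answer from a hard-coded keyword in the final question."""
    question = prompt.rsplit("Q: ", 1)[1]
    return "Paris" if "France" in question else "unknown"


labeled = [("capital of France", "Paris"), ("capital of Japan", "Tokyo")]
unlabeled = ["what is the capital of France", "capital of Peru"]

selected = select_pseudo_labels(unlabeled, labeled, [copy_nearest, keyword_model])
# In a full pipeline, the model would next be fine-tuned on this combined set.
train_set = labeled + selected
print(selected)
```

In this toy run, the two stand-in models agree only on the France question, so that pair is the only pseudo-labeled sample added to `train_set`; the Peru question is dropped because the models disagree. Lowering `min_agreement` trades pseudo-label quality for quantity.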
