Parameter Efficient Tuning Allows Scalable Personalization of LLMs for Text Entry: A Case Study on Abbreviation Expansion

Abstract

Abbreviation expansion is a strategy for speeding up communication: the user types less, and a language model suggests expansions of the abbreviated input. Here we look at personalizing a Large Language Model's (LLM) suggestions based on prior conversations to enhance the relevance of predictions, particularly when the user data is small (~1000 samples). Specifically, we compare fine-tuning, prompt-tuning, and retrieval-augmented generation of expanded text suggestions for abbreviated inputs. Our case study with a deployed 8B-parameter LLM on a real user living with ALS, together with experiments on movie character personalization, indicates that (1) customization may be necessary in some scenarios, and prompt-tuning generalizes well to those; (2) fine-tuning on in-domain data (with as few as 600 samples) still shows some gains; however, (3) retrieval-augmented few-shot selection also outperforms fine-tuning; and (4) parameter-efficient tuning allows for efficient and scalable personalization. For prompt-tuning, we also find that initializing the learned "soft-prompts" to user-relevant concept tokens leads to higher accuracy than random initialization.
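
To make the retrieval-augmented few-shot selection mentioned above more concrete, the following is a minimal sketch and not the authors' implementation. It assumes a word-initial abbreviation scheme (first letter of each word) and ranks past conversation samples by simple Jaccard token overlap with the current context; the function names, the prompt layout, and the similarity measure are all hypothetical choices made for illustration. The resulting prompt would be passed to an LLM, which is not shown here.

```python
import re
from typing import List, Set, Tuple


def abbreviate(phrase: str) -> str:
    """Assumed abbreviation scheme: keep the first letter of each word."""
    return "".join(w[0].lower() for w in re.findall(r"[A-Za-z']+", phrase))


def tokens(text: str) -> Set[str]:
    """Lowercased word set used for the overlap score."""
    return set(re.findall(r"[a-z']+", text.lower()))


def select_few_shot(
    query_context: str,
    history: List[Tuple[str, str]],
    k: int = 2,
) -> List[Tuple[str, str]]:
    """Return the k past (context, phrase) pairs whose context has the
    highest Jaccard overlap with the current context (illustrative only)."""
    q = tokens(query_context)

    def score(item: Tuple[str, str]) -> float:
        c = tokens(item[0])
        union = q | c
        return len(q & c) / len(union) if union else 0.0

    return sorted(history, key=score, reverse=True)[:k]


def build_prompt(
    query_context: str,
    abbreviation: str,
    shots: List[Tuple[str, str]],
) -> str:
    """Format the retrieved samples as few-shot examples, then the query."""
    blocks = [
        f"Context: {ctx}\nAbbreviation: {abbreviate(phrase)}\nExpansion: {phrase}\n"
        for ctx, phrase in shots
    ]
    blocks.append(
        f"Context: {query_context}\nAbbreviation: {abbreviation}\nExpansion:"
    )
    return "\n".join(blocks)


if __name__ == "__main__":
    # Made-up conversation history standing in for a user's past data.
    history = [
        ("Do you want coffee or tea?", "I would like some tea"),
        ("How was the movie last night?", "It was really good"),
        ("What would you like to drink?", "Water please"),
    ]
    query = "Would you like something to drink?"
    shots = select_few_shot(query, history, k=2)
    print(build_prompt(query, "ywp", shots))
```

Because the per-user state in this sketch is only the stored conversation history, no model weights change per user. A practical system would likely replace the token-overlap score with a learned embedding similarity.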
