Leveraging Self-Attention for Input-Dependent Soft Prompting in LLMs

Abstract

Achieving strong performance with large language models on domain-specific tasks requires fine-tuning, which is computationally expensive and technically challenging. This paper focuses on parameter-efficient fine-tuning using soft prompting, a promising approach that adapts pre-trained models to downstream tasks by learning a small set of parameters. We propose a novel Input Dependent Soft Prompting technique with a self-Attention Mechanism (ID-SPAM) that generates soft prompts based on the input tokens and attends to different tokens with varying importance. Our method is simple and efficient, keeping the number of trainable parameters small. We show the merits of the proposed approach compared to state-of-the-art techniques on various tasks and demonstrate improved zero-shot domain transfer capability.
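To make the idea of an input-dependent soft prompt concrete, the following is a minimal, dependency-free sketch and not the authors' implementation. The abstract only states that soft prompts are generated from the input tokens via self-attention with few trainable parameters; everything else here is an assumption for illustration: the class name `SoftPromptGenerator`, single-head attention, mean pooling, the tanh bottleneck, and the random matrices standing in for learned weights.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def random_matrix(rows, cols, rng, scale=0.5):
    """Stand-in for learned weights or embeddings: small random values."""
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


class SoftPromptGenerator:
    """Hypothetical generator mapping input token embeddings to a soft prompt."""

    def __init__(self, d_model, n_prompt_tokens, d_bottleneck, seed=0):
        rng = random.Random(seed)
        self.d_model = d_model
        self.n_prompt_tokens = n_prompt_tokens
        # Single-head attention projections (assumed design).
        self.w_q = random_matrix(d_model, d_model, rng)
        self.w_k = random_matrix(d_model, d_model, rng)
        self.w_v = random_matrix(d_model, d_model, rng)
        # Small bottleneck keeps the trainable parameter count low (assumed design).
        self.w_down = random_matrix(d_model, d_bottleneck, rng)
        self.w_up = random_matrix(d_bottleneck, n_prompt_tokens * d_model, rng)

    def attention_weights(self, token_embeddings):
        """Return one row of attention weights per input token."""
        q = matmul(token_embeddings, self.w_q)
        k = matmul(token_embeddings, self.w_k)
        k_transposed = [list(col) for col in zip(*k)]
        scores = matmul(q, k_transposed)
        scale = math.sqrt(self.d_model)
        return [softmax([s / scale for s in row]) for row in scores]

    def generate(self, token_embeddings):
        """Build an n_prompt_tokens x d_model soft prompt from the input."""
        weights = self.attention_weights(token_embeddings)
        v = matmul(token_embeddings, self.w_v)
        attended = matmul(weights, v)
        # Mean-pool over the sequence so any input length gives one vector.
        pooled = [sum(col) / len(attended) for col in zip(*attended)]
        hidden = [math.tanh(h) for h in matmul([pooled], self.w_down)[0]]
        flat = matmul([hidden], self.w_up)[0]
        d = self.d_model
        return [flat[i * d:(i + 1) * d] for i in range(self.n_prompt_tokens)]


rng = random.Random(1)
generator = SoftPromptGenerator(d_model=4, n_prompt_tokens=3, d_bottleneck=2)
tokens_a = random_matrix(5, 4, rng, scale=1.0)  # 5 input token embeddings
tokens_b = random_matrix(7, 4, rng, scale=1.0)  # a different, longer input
prompt_a = generator.generate(tokens_a)
prompt_b = generator.generate(tokens_b)
print(len(prompt_a), len(prompt_a[0]))  # prompt shape is fixed by the generator
```

In a typical soft-prompting setup, the generated vectors would be prepended to the input representation of a frozen pre-trained model, and only the generator's weights would be trained. Because the prompt is computed from the input, `prompt_a` and `prompt_b` differ even though they share the same shape.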
