Leveraging Self-Attention for Input-Dependent Soft Prompting in LLMs

Abstract

Adapting large language models to domain-specific tasks typically requires fine-tuning, which is computationally expensive and technically challenging. This paper focuses on parameter-efficient fine-tuning using soft prompting, a promising approach that adapts pre-trained models to downstream tasks by learning a small set of parameters. We propose a novel Input Dependent Soft Prompting technique with a self-Attention Mechanism (ID-SPAM) that generates soft prompts based on the input tokens and attends to different tokens with varying importance. Our method is simple and efficient, keeping the number of trainable parameters small. We show the merits of the proposed approach over state-of-the-art techniques on various tasks and demonstrate improved zero-shot domain transfer capability.
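
The mechanism summarized in the abstract can be illustrated with a minimal sketch under stated assumptions. The abstract specifies only that the soft prompt is generated from the input tokens through self-attention and that few parameters are trained. The single attention head, the mean pooling, the bottleneck projection with ReLU, the prompt length, prepending the prompt to the input embeddings, and every name below (`generate_soft_prompt`, `w_down`, `w_up`, and so on) are hypothetical choices for illustration, not the authors' implementation. Weights are random rather than trained.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row):
    """Numerically stable softmax over one list of scores."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def rand_matrix(rows, cols, rng, scale=0.1):
    """Random Gaussian matrix standing in for trained weights."""
    return [[rng.gauss(0.0, scale) for _ in range(cols)] for _ in range(rows)]


def generate_soft_prompt(x, params, prompt_len):
    """Map input token embeddings x (n x d) to a soft prompt (prompt_len x d)."""
    d = len(x[0])
    q = matmul(x, params["w_q"])
    k = matmul(x, params["w_k"])
    v = matmul(x, params["w_v"])
    # Scaled dot-product attention: each token weighs every other token.
    k_t = [list(col) for col in zip(*k)]
    scores = matmul(q, k_t)
    weights = [softmax([s / math.sqrt(d) for s in row]) for row in scores]
    attended = matmul(weights, v)
    # Assumption: mean-pool the attended tokens into one context vector.
    n = len(attended)
    pooled = [sum(col) / n for col in zip(*attended)]
    # Assumption: a small bottleneck keeps the trainable parameter count low.
    hidden = [max(0.0, h) for h in matmul([pooled], params["w_down"])[0]]
    flat = matmul([hidden], params["w_up"])[0]
    # Reshape the flat vector into prompt_len prompt vectors of size d.
    return [flat[i * d:(i + 1) * d] for i in range(prompt_len)]


rng = random.Random(0)
n_tokens, d_model, bottleneck, prompt_len = 5, 6, 4, 3
params = {
    "w_q": rand_matrix(d_model, d_model, rng),
    "w_k": rand_matrix(d_model, d_model, rng),
    "w_v": rand_matrix(d_model, d_model, rng),
    "w_down": rand_matrix(d_model, bottleneck, rng),
    "w_up": rand_matrix(bottleneck, prompt_len * d_model, rng),
}
x = rand_matrix(n_tokens, d_model, rng, scale=1.0)  # stand-in token embeddings
prompt = generate_soft_prompt(x, params, prompt_len)
augmented = prompt + x  # assumption: prompt vectors are prepended to the input
print(len(prompt), len(prompt[0]), len(augmented))
```

In this sketch only the entries of `params` would be trained while the language model itself stays frozen, which is the sense in which soft prompting is parameter-efficient. Because the prompt is computed from `x`, different inputs generally yield different prompts.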

