Residual Prompt Tuning: Improving Prompt Tuning with Residual Reparameterization

Abstract

Prompt tuning is one of the successful approaches to parameter-efficient tuning of pre-trained language models. Although it is arguably the most parameter-efficient method (the tuned soft prompts make up less than 0.1% of total parameters), it typically performs worse than other efficient tuning methods and is quite sensitive to hyperparameters. In this work, we introduce Residual Prompt Tuning, a simple and efficient method that significantly improves the performance and stability of prompt tuning. We propose to reparameterize the soft prompt embeddings using a shallow network with a residual connection. Our experiments show that Residual Prompt Tuning significantly outperforms prompt tuning on the SuperGLUE benchmark. Notably, our method achieves a +7 point improvement over prompt tuning with T5-Base and allows the prompt length to be reduced by 10x without hurting performance. In addition, we show that our approach is robust to the choice of learning rate and prompt initialization, and is effective in few-shot settings.
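The reparameterization described in the abstract can be illustrated with a small sketch. The abstract only says the network is shallow and has a residual connection, so everything else here is an assumption made for illustration: the bottleneck shape (down-projection, ReLU, up-projection), the sizes, and the names `reparameterize`, `w_down`, and `w_up`. The sketch uses only the Python standard library and shows the forward pass only, not training.

```python
import random


def make_matrix(rows, cols, rng, scale=0.1):
    """Return a rows x cols matrix (list of rows) of small random weights."""
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def relu(vector):
    """Element-wise ReLU."""
    return [max(0.0, x) for x in vector]


def reparameterize(prompt, w_down, w_up):
    """Map each prompt embedding p to p + W_up @ relu(W_down @ p).

    The `p + ...` term is the residual (skip) connection. The bottleneck
    MLP is an assumed stand-in for the "shallow network" in the abstract.
    """
    out = []
    for p in prompt:
        hidden = relu(matvec(w_down, p))   # project down to the bottleneck
        delta = matvec(w_up, hidden)       # project back to embedding size
        out.append([pi + di for pi, di in zip(p, delta)])
    return out


rng = random.Random(0)
prompt_len, dim, bottleneck = 4, 8, 2            # hypothetical sizes
prompt = make_matrix(prompt_len, dim, rng)       # soft prompt embeddings
w_down = make_matrix(bottleneck, dim, rng)       # down-projection weights
w_up = make_matrix(dim, bottleneck, rng)         # up-projection weights

new_prompt = reparameterize(prompt, w_down, w_up)
```

The reparameterized prompt has the same shape as the original, so it can stand in wherever the raw soft prompt would be used. Because of the skip connection, the mapping reduces to the identity when the up-projection is zero, which means the network only has to learn a correction to the prompt embeddings.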
