Residual Prompt Tuning: Improving Prompt Tuning with Residual Reparameterization
May 6, 2023
Authors: Anastasia Razdaibiedina, Yuning Mao, Rui Hou, Madian Khabsa, Mike Lewis, Jimmy Ba, Amjad Almahairi
cs.AI
Abstract
Prompt tuning is one of the successful approaches for parameter-efficient
tuning of pre-trained language models. Despite being arguably the most
parameter-efficient (tuned soft prompts constitute <0.1% of total parameters),
it typically performs worse than other efficient tuning methods and is quite
sensitive to hyper-parameters. In this work, we introduce Residual Prompt
Tuning - a simple and efficient method that significantly improves the
performance and stability of prompt tuning. We propose to reparameterize soft
prompt embeddings using a shallow network with a residual connection. Our
experiments show that Residual Prompt Tuning significantly outperforms prompt
tuning on the SuperGLUE benchmark. Notably, our method achieves a +7 point
improvement over prompt tuning with T5-Base and allows the prompt length to be
reduced by 10x without hurting performance. In addition, we show that our
approach is robust to the choice of learning rate and prompt initialization,
and is effective in few-shot settings.
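To make the reparameterization concrete, below is a minimal sketch (not the authors' released implementation) of how soft prompt embeddings can be passed through a shallow MLP with a residual connection before being prepended to the frozen model's input embeddings. The `ResidualPromptEncoder` name, the bottleneck width, the ReLU activation, and the LayerNorm placement are illustrative assumptions not specified in the abstract.

```python
import torch
import torch.nn as nn

class ResidualPromptEncoder(nn.Module):
    """Sketch of residual reparameterization of soft prompts:
    prompt embeddings are passed through a shallow MLP and added
    back to themselves via a skip connection."""

    def __init__(self, prompt_len: int, embed_dim: int, bottleneck_dim: int = 128):
        super().__init__()
        # Trainable soft prompt embeddings.
        self.prompt = nn.Parameter(torch.randn(prompt_len, embed_dim) * 0.02)
        # Shallow reparameterization network (down-project, nonlinearity, up-project).
        self.mlp = nn.Sequential(
            nn.Linear(embed_dim, bottleneck_dim),
            nn.ReLU(),
            nn.Linear(bottleneck_dim, embed_dim),
        )
        self.norm = nn.LayerNorm(embed_dim)  # assumed; not stated in the abstract

    def forward(self) -> torch.Tensor:
        # Residual connection: reparameterized prompt + original prompt.
        return self.norm(self.mlp(self.prompt)) + self.prompt


# Usage sketch: prepend the reparameterized prompt to frozen input embeddings.
encoder = ResidualPromptEncoder(prompt_len=10, embed_dim=768)
input_embeds = torch.randn(4, 32, 768)                   # batch of token embeddings
prompt = encoder().unsqueeze(0).expand(4, -1, -1)        # (4, 10, 768)
full_embeds = torch.cat([prompt, input_embeds], dim=1)   # (4, 42, 768)
```

In this sketch only the soft prompt and the small reparameterization network are trained while the backbone stays frozen; once training is done, one could store just the final reparameterized prompt, keeping the inference-time parameter cost the same as plain prompt tuning.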