Leveraging Self-Attention for Input-Dependent Soft Prompting in LLMs
June 5, 2025
Authors: Ananth Muppidi, Abhilash Nandy, Sambaran Bandyopadhyay
cs.AI
Abstract
The performance of large language models in domain-specific tasks often necessitates fine-tuning, which is computationally expensive and technically challenging. This paper focuses on parameter-efficient fine-tuning using soft prompting, a promising approach that adapts pre-trained models to downstream tasks by learning a small set of parameters. We propose a novel Input-Dependent Soft Prompting technique with a self-Attention Mechanism (ID-SPAM) that generates soft prompts based on the input tokens and attends to different tokens with varying importance. Our method is simple and efficient, keeping the number of trainable parameters small. We show the merits of the proposed approach compared to state-of-the-art techniques on various tasks and demonstrate its improved zero-shot domain transfer capability.
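As a concrete illustration of the idea in the abstract, the sketch below shows one way input-dependent soft prompts could be generated by attending over the input token embeddings with a small set of learnable prompt queries. This is a minimal sketch, not the authors' implementation: the class name `InputDependentSoftPrompt`, the use of PyTorch's `nn.MultiheadAttention`, and all dimensions and hyperparameters are illustrative assumptions; the actual ID-SPAM architecture may differ.

```python
# Hypothetical sketch of an input-dependent soft prompt generator in the
# spirit of ID-SPAM. Names and dimensions are illustrative assumptions.
import torch
import torch.nn as nn


class InputDependentSoftPrompt(nn.Module):
    """Generates soft prompt vectors by attending over input token embeddings."""

    def __init__(self, hidden_dim: int = 768, num_prompts: int = 10, num_heads: int = 8):
        super().__init__()
        # Learnable queries, one per soft prompt position (small parameter count).
        self.prompt_queries = nn.Parameter(torch.randn(num_prompts, hidden_dim) * 0.02)
        # Multi-head attention lets each prompt weigh input tokens with varying importance.
        self.attn = nn.MultiheadAttention(hidden_dim, num_heads, batch_first=True)

    def forward(self, token_embeddings: torch.Tensor) -> torch.Tensor:
        # token_embeddings: (batch, seq_len, hidden_dim) from the frozen LLM's embedding layer.
        batch = token_embeddings.size(0)
        queries = self.prompt_queries.unsqueeze(0).expand(batch, -1, -1)
        # Attend over the input tokens so the prompts depend on the specific input.
        prompts, _ = self.attn(queries, token_embeddings, token_embeddings)
        return prompts  # (batch, num_prompts, hidden_dim)


# Usage: prepend the generated prompts to the (frozen) model's input embeddings.
embeds = torch.randn(2, 32, 768)                 # stand-in for token embeddings
prompts = InputDependentSoftPrompt()(embeds)     # (2, 10, 768)
augmented = torch.cat([prompts, embeds], dim=1)  # fed to the frozen LLM
```

In such a setup, only the prompt generator's parameters would be trained while the backbone LLM stays frozen, which is what keeps the trainable parameter count small.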