Spectral Attention Steering for Prompt Highlighting
March 1, 2026
Authors: Weixian Waylon Li, Yuchen Niu, Yongxin Yang, Keshuang Li, Tiejun Ma, Shay B. Cohen
cs.AI
Abstract
Attention steering is an important technique for controlling model focus, enabling capabilities such as prompt highlighting, where the model prioritises user-specified text. However, existing attention steering methods require explicit storage of the full attention matrix, making them incompatible with memory-efficient implementations such as FlashAttention. We introduce Spectral Editing Key Amplification (SEKA), a training-free steering method that addresses this limitation by editing key embeddings directly, before the attention computation. SEKA uses spectral decomposition to steer key embeddings towards latent directions that amplify the attention scores of user-specified tokens. We extend this to Adaptive SEKA (AdaSEKA), a query-adaptive variant that uses a training-free routing mechanism to dynamically combine multiple expert subspaces based on the prompt's semantic intent. Our experiments show that both methods significantly outperform strong baselines on standard steering benchmarks while adding minimal latency and memory overhead, and remain fully compatible with optimised attention implementations.
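To make the core idea concrete, here is a minimal sketch of spectral key editing. All names, parameters, and the exact projection rule are illustrative assumptions, not the paper's implementation: it simply amplifies the components of highlighted tokens' keys along the top singular directions of the key matrix, which raises their dot products with aligned queries without ever materialising the attention matrix.

```python
import numpy as np

def seka_steer_keys(K, highlight_idx, alpha=0.5, rank=4):
    """Hypothetical SEKA-style key editing for one attention head.

    K:             (seq_len, d) key embeddings
    highlight_idx: indices of tokens the user wants emphasised
    alpha:         amplification strength (assumed hyperparameter)
    rank:          number of spectral directions to steer along
    """
    # Spectral decomposition of the key matrix: the top right-singular
    # vectors span the dominant latent directions of the key space.
    _, _, Vt = np.linalg.svd(K, full_matrices=False)
    V = Vt[:rank].T                      # (d, rank) top spectral directions

    K_out = K.copy()
    # Boost the highlighted keys' components inside that subspace, leaving
    # all other keys (and the attention computation itself) untouched.
    proj = K[highlight_idx] @ V          # (n_highlighted, rank) coefficients
    K_out[highlight_idx] += alpha * (proj @ V.T)
    return K_out
```

Because only the key tensor is edited before attention runs, the downstream softmax can still be computed by any fused kernel (e.g. FlashAttention), which is the compatibility property the abstract emphasises.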