Prism-Δ: Differential Subspace Steering for Prompt Highlighting in Large Language Models

Abstract

Prompt highlighting steers a large language model to prioritize user-specified text spans during generation. A key challenge is extracting steering directions that capture the difference between relevant and irrelevant contexts, rather than structural patterns shared by both. We propose PRISM-Δ (Projection-based Relevance-Informed Steering Method), which decomposes the difference between the positive and negative cross-covariance matrices, maximizing discriminative energy while eliminating shared directions. Each attention head receives a continuous softplus importance weight, so that weak-but-useful heads still contribute at reduced strength. The framework extends naturally to Value representations, capturing content-channel signal that Key-only methods leave unused. Across four benchmarks and five models, PRISM-Δ matches or exceeds the best existing method in 19 of 20 configurations, with relative gains of up to 10.6%, while halving the fluency cost of steering. PRISM-Δ also scales to long-context retrieval, where it outperforms the best existing method by up to 4.8% (relative). PRISM-Δ is compatible with FlashAttention and adds negligible memory overhead.
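To make the core idea concrete, the following is a minimal NumPy sketch of a differential-subspace computation for a single attention head. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not say which representations enter the cross-covariance or how the directions are applied, so the sketch assumes query/key cross-covariances, a singular value decomposition of their difference, and an additive projection applied to the keys. All function and parameter names (`cross_covariance`, `differential_directions`, `steer_keys`, `head_score`, `gain`) are hypothetical.

```python
import numpy as np


def softplus(x):
    # Numerically stable softplus, log(1 + exp(x)); always positive,
    # so a low-scoring head is attenuated rather than switched off.
    return np.logaddexp(0.0, x)


def cross_covariance(a, b):
    # a, b: (n_samples, dim). Returns the (dim, dim) cross-covariance
    # of the mean-centered samples.
    a_c = a - a.mean(axis=0, keepdims=True)
    b_c = b - b.mean(axis=0, keepdims=True)
    return a_c.T @ b_c / a.shape[0]


def differential_directions(q_pos, k_pos, q_neg, k_neg, rank):
    # ASSUMPTION: "positive" pairs come from relevant context and
    # "negative" pairs from irrelevant context. Structure shared by
    # both cancels in the difference before the decomposition.
    delta = cross_covariance(q_pos, k_pos) - cross_covariance(q_neg, k_neg)
    _, singular_values, vt = np.linalg.svd(delta)
    # Key-side directions with the largest differential energy.
    return vt[:rank].T, singular_values[:rank]


def steer_keys(keys, directions, head_score, gain=1.0):
    # ASSUMPTION: steering amplifies the component of each key that lies
    # in the differential subspace, scaled by a softplus head weight.
    weight = softplus(head_score)
    projector = directions @ directions.T
    return keys + (gain * weight) * (keys @ projector)
```

In this sketch, a key's component inside the subspace is scaled by `1 + gain * softplus(head_score)`, while the orthogonal component is left unchanged. The same construction could be applied to Value representations by substituting value vectors for keys, though how the paper does this is not stated in the abstract.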

