Dual-View Training for Instruction-Following Information Retrieval

Abstract

Instruction-following information retrieval (IF-IR) studies retrieval systems that must not only find documents relevant to a query, but also obey explicit user constraints such as required attributes, exclusions, or output preferences. However, most retrievers are trained primarily for semantic relevance and often fail to distinguish documents that merely match the topic from those that satisfy the instruction. We propose a dual-view data synthesis strategy based on polarity reversal: given a query, a document that is relevant under the instruction, and a hard negative that matches the query but violates the instruction, we prompt an LLM to generate a complementary instruction under which the two documents swap relevance labels. Presenting the same document pair under complementary instructions that invert their relevance labels forces the retriever to re-evaluate the same candidate set in light of the instruction, rather than relying on fixed topical cues. On a 305M-parameter encoder, our method improves performance on the FollowIR benchmark by 45%, surpassing general-purpose embedding models of comparable or larger scale. Through head-to-head comparisons at matched data budgets, we further show that data diversity and instruction supervision play complementary roles: the former preserves general retrieval quality, while the latter improves instruction sensitivity. These results highlight the value of targeted data synthesis for building retrieval systems that are both broadly capable and instruction-aware.
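The polarity-reversal step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `TrainingExample` container, the `make_dual_views` function, and the `InstructionGenerator` signature are hypothetical names introduced here, and the LLM call is replaced by a stub because the abstract does not give the prompt. The sketch only shows how one (query, instruction, positive, hard negative) tuple could yield two training views in which the same two documents trade labels.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class TrainingExample:
    """One training view (hypothetical container, not from the paper)."""
    query: str
    instruction: str
    positive: str   # document that satisfies the instruction
    negative: str   # on-topic document that violates the instruction


# Assumed signature: (query, instruction, positive, negative) -> new instruction.
# In practice this would wrap an LLM prompt; the prompt is not specified
# in the abstract, so it is left abstract here.
InstructionGenerator = Callable[[str, str, str, str], str]


def make_dual_views(
    query: str,
    instruction: str,
    positive: str,
    hard_negative: str,
    generate_complementary: InstructionGenerator,
) -> List[TrainingExample]:
    """Return the original view plus its polarity-reversed view."""
    original = TrainingExample(query, instruction, positive, hard_negative)
    flipped_instruction = generate_complementary(
        query, instruction, positive, hard_negative
    )
    # Same query and same two documents, but the relevance labels are swapped.
    reversed_view = TrainingExample(
        query, flipped_instruction, hard_negative, positive
    )
    return [original, reversed_view]


def stub_generator(query: str, instruction: str, positive: str, negative: str) -> str:
    # Stand-in for the LLM call so the sketch runs offline.
    return "Only include documents published before 2020."


views = make_dual_views(
    query="solar panel efficiency",
    instruction="Only include documents published in 2020 or later.",
    positive="A 2023 review of perovskite solar cells.",
    hard_negative="A 2015 survey of silicon panel efficiency.",
    generate_complementary=stub_generator,
)
```

Because both views share the query and the document pair, a retriever trained on them cannot separate positive from negative by topical cues alone; only the instruction distinguishes the two cases. How the views are batched and which loss is applied are left open here, since the abstract does not specify them.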
