Hypencoder: Hypernetworks for Information Retrieval

Abstract

The vast majority of retrieval models depend on vector inner products to produce a relevance score between a query and a document. This naturally limits the expressiveness of the relevance score that can be employed. We propose a new paradigm: instead of producing a vector to represent the query, we produce a small neural network that acts as a learned relevance function. This small neural network takes in a representation of the document (in this paper, a single vector) and produces a scalar relevance score. To produce the small neural network, we use a hypernetwork, a network that produces the weights of other networks, as our query encoder, which we call a Hypencoder. Experiments on in-domain search tasks show that Hypencoder significantly outperforms strong dense retrieval models and achieves higher metrics than reranking models and models an order of magnitude larger. Hypencoder is also shown to generalize well to out-of-domain search tasks. To assess the extent of Hypencoder's capabilities, we evaluate on a set of hard retrieval tasks, including tip-of-the-tongue and instruction-following retrieval, and find that the performance gap widens substantially compared to standard retrieval tasks. Furthermore, to demonstrate the practicality of our method, we implement an approximate search algorithm and show that our model is able to search 8.8M documents in under 60ms.
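To make the core idea concrete, the following is a minimal sketch, not the authors' architecture: a toy hypernetwork maps a query vector to the weights of a one-hidden-layer network, and that generated network scores each document vector with a scalar. The class `ToyHypernetwork`, the helpers `make_linear`, `matvec`, and `rank`, the linear "heads", the ReLU hidden layer, and all dimensions are assumptions made for illustration. The weights are random rather than trained, so the resulting ranking is meaningless beyond showing the data flow.

```python
import random


def make_linear(rng, out_dim, in_dim):
    """Random (untrained) matrix standing in for learned hypernetwork parameters."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(in_dim)] for _ in range(out_dim)]


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


class ToyHypernetwork:
    """Hypothetical query encoder: query vector -> weights of a small scoring network."""

    def __init__(self, query_dim, doc_dim, hidden_dim, seed=0):
        rng = random.Random(seed)
        self.doc_dim = doc_dim
        self.hidden_dim = hidden_dim
        # One linear head per group of generated weights (an assumption of this sketch).
        self.w1_head = make_linear(rng, hidden_dim * doc_dim, query_dim)
        self.b1_head = make_linear(rng, hidden_dim, query_dim)
        self.w2_head = make_linear(rng, hidden_dim, query_dim)

    def generate_scorer(self, query_vec):
        """Return a query-specific relevance function over document vectors."""
        flat_w1 = matvec(self.w1_head, query_vec)
        # Reshape the flat output into a hidden_dim x doc_dim weight matrix.
        w1 = [flat_w1[i * self.doc_dim:(i + 1) * self.doc_dim]
              for i in range(self.hidden_dim)]
        b1 = matvec(self.b1_head, query_vec)
        w2 = matvec(self.w2_head, query_vec)

        def score(doc_vec):
            # Hidden layer with ReLU, then a linear readout to a scalar score.
            hidden = [max(0.0, h + b) for h, b in zip(matvec(w1, doc_vec), b1)]
            return sum(w * h for w, h in zip(w2, hidden))

        return score


def rank(scorer, doc_vecs):
    """Return document indices sorted by descending relevance score."""
    return sorted(range(len(doc_vecs)), key=lambda i: scorer(doc_vecs[i]), reverse=True)


# Usage: one scorer is generated per query and then applied to every document.
hyper = ToyHypernetwork(query_dim=4, doc_dim=3, hidden_dim=5)
scorer = hyper.generate_scorer([0.2, -0.1, 0.7, 0.4])
docs = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.5, 0.5, 0.5]]
order = rank(scorer, docs)
```

Unlike an inner product, the generated scorer is a nonlinear function of the document vector, which is the added expressiveness the abstract refers to. The exhaustive scan in `rank` is only for illustration and is not the approximate search algorithm mentioned above.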
