Perplexity Trap: PLM-Based Retrievers Overrate Low Perplexity Documents

Abstract

Previous studies have found that PLM-based retrieval models exhibit a preference for LLM-generated content, assigning higher relevance scores to these documents even when their semantic quality is comparable to that of human-written ones. This phenomenon, known as source bias, threatens the sustainable development of the information access ecosystem. However, the underlying causes of source bias remain unexplored. In this paper, we explain the information retrieval process with a causal graph and find that PLM-based retrievers learn perplexity features for relevance estimation, causing source bias by ranking low-perplexity documents higher. Theoretical analysis further reveals that the phenomenon stems from the positive correlation between the gradients of the loss functions in the language modeling and retrieval tasks. Based on this analysis, we propose a causal-inspired inference-time debiasing method called Causal Diagnosis and Correction (CDC). CDC first diagnoses the bias effect of perplexity and then separates that bias effect from the overall estimated relevance score. Experimental results across three domains demonstrate the superior debiasing effectiveness of CDC, supporting the validity of the proposed explanatory framework. Source code is available at https://github.com/WhyDwelledOnAi/Perplexity-Trap.
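
The two-stage idea of diagnosing a perplexity effect and then removing it from the relevance score can be illustrated with a minimal sketch. This is not the estimator used by CDC. It assumes, purely for illustration, that the bias is linear in perplexity and that the calibration documents are equally relevant to the query, so any score difference among them can be attributed to perplexity. The function names `perplexity`, `diagnose_bias`, and `correct_score` are hypothetical and do not come from the released code.

```python
import math

def perplexity(token_log_probs):
    """Perplexity of a document from its per-token natural-log probabilities."""
    return math.exp(-sum(token_log_probs) / len(token_log_probs))

def diagnose_bias(scores, ppls):
    """Diagnosis step (illustrative assumption): least-squares slope of the
    retriever's relevance score on document perplexity, estimated on
    calibration documents assumed to be equally relevant."""
    n = len(scores)
    mean_s = sum(scores) / n
    mean_p = sum(ppls) / n
    cov = sum((p - mean_p) * (s - mean_s) for p, s in zip(ppls, scores))
    var = sum((p - mean_p) ** 2 for p in ppls)
    return cov / var

def correct_score(score, ppl, beta):
    """Correction step (illustrative assumption): subtract the estimated
    linear perplexity effect from the raw relevance score."""
    return score - beta * ppl

# Hypothetical calibration data: lower perplexity coincides with higher score.
calib_scores = [0.9, 0.8, 0.7]
calib_ppls = [10.0, 20.0, 30.0]
beta = diagnose_bias(calib_scores, calib_ppls)  # negative slope here

# After correction, the perplexity-driven gap between the documents shrinks.
corrected = [correct_score(s, p, beta) for s, p in zip(calib_scores, calib_ppls)]
```

A negative `beta` means the retriever rewards low perplexity. Subtracting `beta * ppl` at inference time leaves the retriever itself unchanged, which matches the inference-time setting described in the abstract.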
