BordIRlines: A Dataset for Evaluating Cross-lingual Retrieval-Augmented Generation

Abstract

Large language models excel at creative generation but continue to struggle with the issues of hallucination and bias. While retrieval-augmented generation (RAG) provides a framework for grounding LLMs' responses in accurate and up-to-date information, it still raises the question of bias: which sources should be selected for inclusion in the context? And how should their importance be weighted? In this paper, we study the challenge of cross-lingual RAG and present a dataset to investigate the robustness of existing systems at answering queries about geopolitical disputes, which exist at the intersection of linguistic, cultural, and political boundaries. Our dataset is sourced from Wikipedia pages containing information relevant to the given queries, and we investigate the impact of including additional context, as well as the composition of this context in terms of language and source, on an LLM's response. Our results show that existing RAG systems continue to be challenged by cross-lingual use cases and suffer from a lack of consistency when they are provided with competing information in multiple languages. We present case studies to illustrate these issues and outline steps for future research to address these challenges. We make our dataset and code publicly available at https://github.com/manestay/bordIRlines.
