CORG: Generating Answers from Complex, Interrelated Contexts

Abstract

In a real-world corpus, knowledge frequently recurs across documents but often contains inconsistencies due to ambiguous naming, outdated information, or errors, leading to complex interrelationships between contexts. Previous research has shown that language models struggle with these complexities, but has typically examined single factors in isolation. We classify these relationships into four types: distracting, ambiguous, counterfactual, and duplicated. Our analysis reveals that no single approach effectively addresses all of these interrelationships simultaneously. We therefore introduce Context Organizer (CORG), a framework that organizes multiple contexts into independently processed groups. This design allows the model to efficiently find all relevant answers while ensuring disambiguation. CORG consists of three key components: a graph constructor, a reranker, and an aggregator. Our results demonstrate that CORG balances performance and efficiency effectively, outperforming existing grouping methods and achieving results comparable to more computationally intensive single-context approaches.
