CORG: Generating Answers from Complex, Interrelated Contexts

Abstract

In a real-world corpus, knowledge frequently recurs across documents but often contains inconsistencies due to ambiguous naming, outdated information, or errors, leading to complex interrelationships between contexts. Previous research has shown that language models struggle with these complexities, typically focusing on single factors in isolation. We classify these relationships into four types: distracting, ambiguous, counterfactual, and duplicated. Our analysis reveals that no single approach effectively addresses all these interrelationships simultaneously. Therefore, we introduce Context Organizer (CORG), a framework that organizes multiple contexts into independently processed groups. This design allows the model to efficiently find all relevant answers while ensuring disambiguation. CORG consists of three key components: a graph constructor, a reranker, and an aggregator. Our results demonstrate that CORG balances performance and efficiency effectively, outperforming existing grouping methods and achieving results comparable to more computationally intensive single-context approaches.
