Beyond True or False: Retrieval-Augmented Hierarchical Analysis of Nuanced Claims

Abstract

Claims made by individuals or entities are often nuanced and cannot be clearly labeled as entirely "true" or "false", as is frequently the case with scientific and political claims. However, a claim (e.g., "vaccine A is better than vaccine B") can be dissected into its integral aspects and sub-aspects (e.g., efficacy, safety, distribution), which are individually easier to validate. This enables a more comprehensive, structured response that provides a well-rounded perspective on a given problem while also allowing the reader to prioritize specific angles of interest within the claim (e.g., safety for children). We therefore propose ClaimSpect, a retrieval-augmented generation-based framework that automatically constructs a hierarchy of the aspects typically considered when addressing a claim and enriches them with corpus-specific perspectives. This structure hierarchically partitions an input corpus to retrieve relevant segments, which assist in discovering new sub-aspects. Moreover, these segments enable the discovery of varying perspectives towards an aspect of the claim (e.g., support, neutral, or oppose) and their respective prevalence (e.g., "how many biomedical papers believe vaccine A is more transportable than B?"). We apply ClaimSpect to a wide variety of real-world scientific and political claims featured in our constructed dataset, showcasing its robustness and accuracy in deconstructing a nuanced claim and representing perspectives within a corpus. Through real-world case studies and human evaluation, we validate its effectiveness over multiple baselines.
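
To make the output structure described in the abstract concrete, the following is a minimal sketch of an aspect hierarchy whose nodes carry stance tallies. It is not ClaimSpect's implementation: the names `AspectNode`, `record`, `prevalence`, and `render` are hypothetical, the stance labels follow the abstract's example (support, neutral, oppose), and the tallies are made-up illustrative values rather than results. Retrieval and stance classification are assumed to happen elsewhere; the sketch only shows how per-aspect perspectives and their prevalence could be stored and summarized.

```python
from __future__ import annotations

from collections import Counter
from dataclasses import dataclass, field

# Stance labels taken from the abstract's example; the tuple itself is an assumption.
STANCES = ("support", "neutral", "oppose")


@dataclass
class AspectNode:
    """One aspect (or sub-aspect) of a claim, with stance tallies from a corpus."""

    name: str
    children: list[AspectNode] = field(default_factory=list)
    stance_counts: Counter = field(default_factory=Counter)

    def add_child(self, name: str) -> AspectNode:
        """Attach a sub-aspect and return it so calls can be chained."""
        child = AspectNode(name)
        self.children.append(child)
        return child

    def record(self, stance: str) -> None:
        """Count one retrieved segment (e.g., a paper) taking the given stance."""
        if stance not in STANCES:
            raise ValueError(f"unknown stance: {stance!r}")
        self.stance_counts[stance] += 1

    def prevalence(self) -> dict[str, float]:
        """Fraction of recorded segments per stance (all zeros if none recorded)."""
        total = sum(self.stance_counts.values())
        if total == 0:
            return {s: 0.0 for s in STANCES}
        return {s: self.stance_counts[s] / total for s in STANCES}


def render(node: AspectNode, depth: int = 0) -> list[str]:
    """Return an indented outline of the hierarchy, one line per aspect."""
    counts = ", ".join(f"{s}={node.stance_counts[s]}" for s in STANCES)
    lines = ["  " * depth + f"{node.name} ({counts})"]
    for child in node.children:
        lines.extend(render(child, depth + 1))
    return lines


# Illustrative hierarchy for the abstract's running example (made-up tallies).
root = AspectNode("vaccine A is better than vaccine B")
safety = root.add_child("safety")
children_safety = safety.add_child("safety for children")
for stance in ["support", "support", "oppose", "neutral"]:
    children_safety.record(stance)

print("\n".join(render(root)))
print(children_safety.prevalence())
```

Keeping the tallies on each node, rather than in one flat table, lets a reader drill into a single angle of interest (here, safety for children) without recomputing anything for the rest of the tree.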
