ObfusQAte: A Proposed Framework to Evaluate LLM Robustness on Obfuscated Factual Question Answering

Abstract

The rapid proliferation of Large Language Models (LLMs) has contributed significantly to the development of equitable AI systems capable of factual question answering (QA). However, no known study tests LLM robustness when the models are presented with obfuscated versions of questions. To evaluate these limitations systematically, we propose a novel technique, ObfusQAte, and use it to introduce ObfusQA, a comprehensive, first-of-its-kind framework with multi-tiered obfuscation levels designed to examine LLM capabilities across three distinct dimensions: (i) Named-Entity Indirection, (ii) Distractor Indirection, and (iii) Contextual Overload. By capturing these fine-grained distinctions in language, ObfusQA provides a comprehensive benchmark for evaluating LLM robustness and adaptability. Our study observes that LLMs tend to fail or generate hallucinated responses when confronted with these increasingly nuanced variations. To foster research in this direction, we make ObfusQAte publicly available.
