Operationalizing Contextual Integrity in Privacy-Conscious Assistants

Abstract

Advanced AI assistants combine frontier large language models (LLMs) and tool access to autonomously perform complex tasks on behalf of users. While the helpfulness of such assistants can increase dramatically with access to user information, including emails and documents, this raises privacy concerns about assistants sharing inappropriate information with third parties without user supervision. To steer information-sharing assistants to behave in accordance with privacy expectations, we propose to operationalize contextual integrity (CI), a framework that equates privacy with the appropriate flow of information in a given context. In particular, we design and evaluate a number of strategies to steer assistants' information-sharing actions to be CI compliant. Our evaluation is based on a novel form-filling benchmark composed of synthetic data and human annotations, and it reveals that prompting frontier LLMs to perform CI-based reasoning yields strong results.

