

PrivacyLens: Evaluating Privacy Norm Awareness of Language Models in Action

August 29, 2024
作者: Yijia Shao, Tianshi Li, Weiyan Shi, Yanchen Liu, Diyi Yang
cs.AI

Abstract

As language models (LMs) are widely utilized in personalized communication scenarios (e.g., sending emails, writing social media posts) and endowed with a certain level of agency, ensuring they act in accordance with the contextual privacy norms becomes increasingly critical. However, quantifying the privacy norm awareness of LMs and the emerging privacy risk in LM-mediated communication is challenging due to (1) the contextual and long-tailed nature of privacy-sensitive cases, and (2) the lack of evaluation approaches that capture realistic application scenarios. To address these challenges, we propose PrivacyLens, a novel framework designed to extend privacy-sensitive seeds into expressive vignettes and further into agent trajectories, enabling multi-level evaluation of privacy leakage in LM agents' actions. We instantiate PrivacyLens with a collection of privacy norms grounded in privacy literature and crowdsourced seeds. Using this dataset, we reveal a discrepancy between LM performance in answering probing questions and their actual behavior when executing user instructions in an agent setup. State-of-the-art LMs, like GPT-4 and Llama-3-70B, leak sensitive information in 25.68% and 38.69% of cases, even when prompted with privacy-enhancing instructions. We also demonstrate the dynamic nature of PrivacyLens by extending each seed into multiple trajectories to red-team LM privacy leakage risk. Dataset and code are available at https://github.com/SALT-NLP/PrivacyLens.
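The abstract's multi-level setup (privacy-sensitive seed → vignette → agent trajectory, with leakage judged on the agent's final action) can be sketched as a minimal data model. This is an illustrative reconstruction only; the class and field names below are assumptions and do not reflect the actual PrivacyLens repository's API.

```python
from dataclasses import dataclass

# Hypothetical data model mirroring the paper's three levels of abstraction
# (seed -> vignette -> trajectory); names are illustrative, not the repo's API.

@dataclass
class Seed:
    data_type: str       # e.g. "health condition"
    data_subject: str    # whose information it is
    source: str          # where the agent observed the information
    transmission: str    # the context in which sharing would violate a norm

@dataclass
class Trajectory:
    seed: Seed
    final_action: str    # the agent's emitted action (e.g. an email body)
    leaked: bool         # judged label: did the action expose the secret?

def leakage_rate(trajectories: list[Trajectory]) -> float:
    """Fraction of agent trajectories whose final action leaks sensitive info."""
    if not trajectories:
        return 0.0
    return sum(t.leaked for t in trajectories) / len(trajectories)

# Toy example: one leaking and three safe trajectories -> 25% leakage.
seed = Seed("health condition", "a colleague", "group chat", "email to manager")
trajs = [Trajectory(seed, "...", leaked) for leaked in (True, False, False, False)]
print(f"{leakage_rate(trajs):.2%}")  # 25.00%
```

In the paper, the judged leakage labels come from evaluating LM agents on many such trajectories, yielding the reported 25.68% (GPT-4) and 38.69% (Llama-3-70B) rates.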

