Position: Privacy Is Not Just Memorization!
October 2, 2025
Authors: Niloofar Mireshghallah, Tianshi Li
cs.AI
Abstract
The discourse on privacy risks in Large Language Models (LLMs) has
disproportionately focused on verbatim memorization of training data, while a
constellation of more immediate and scalable privacy threats remains
underexplored. This position paper argues that the privacy landscape of LLM
systems extends far beyond training data extraction, encompassing risks from
data collection practices, inference-time context leakage, autonomous agent
capabilities, and the democratization of surveillance through deep inference
attacks. We present a comprehensive taxonomy of privacy risks across the LLM
lifecycle -- from data collection through deployment -- and demonstrate through
case studies how current privacy frameworks fail to address these multifaceted
threats. Through a longitudinal analysis of 1,322 AI/ML privacy papers
published at leading conferences over the past decade (2016--2025), we reveal
that while memorization receives outsized attention in technical research, the
most pressing privacy harms lie elsewhere, where current technical approaches
offer little traction and viable paths forward remain unclear. We call for a
fundamental shift in how the research community approaches LLM privacy, moving
beyond the narrow focus of current technical solutions and embracing
interdisciplinary approaches that address the sociotechnical nature of these
emerging threats.