Privacy-Preserving Recommender Systems with Synthetic Query Generation using Differentially Private Large Language Models
May 10, 2023
Authors: Aldo Gael Carranza, Rezsa Farahani, Natalia Ponomareva, Alex Kurakin, Matthew Jagielski, Milad Nasr
cs.AI
Abstract
We propose a novel approach for developing privacy-preserving large-scale recommender systems using differentially private (DP) large language models (LLMs), which overcomes certain challenges and limitations of DP training for these complex systems. Our method is particularly well suited to the emerging area of LLM-based recommender systems, but it can be readily employed by any recommender system that processes representations of natural-language inputs. Our approach uses DP training methods to fine-tune a publicly pre-trained LLM on a query generation task. The resulting model can generate private synthetic queries representative of the original queries, which can be freely shared for any downstream non-private recommendation training procedure without incurring additional privacy cost. We evaluate our method on its ability to securely train effective deep retrieval models, and we observe significant improvements in their retrieval quality, without compromising query-level privacy guarantees, compared to methods that DP train the retrieval models directly.
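
The abstract describes a two-stage pipeline: (1) fine-tune a publicly pre-trained LLM on the private query corpus with DP training, and (2) sample synthetic queries from the fine-tuned model, which by the post-processing property of DP can be shared and used for non-private downstream retrieval training. The sketch below is illustrative only and is not the authors' implementation: it hand-rolls a DP-SGD step (per-example gradient clipping plus Gaussian noise) on a small public model ("gpt2" as a stand-in), and the hyperparameters and placeholder query list are assumptions.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Stage 1: DP fine-tune a publicly pre-trained LM on the private query corpus.
MODEL_NAME = "gpt2"        # stand-in for a publicly pre-trained LLM
CLIP_NORM = 1.0            # per-example gradient clipping norm (illustrative)
NOISE_MULTIPLIER = 1.1     # Gaussian noise scale (illustrative)

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME)
optimizer = torch.optim.SGD(model.parameters(), lr=1e-4)

# Placeholder private queries; in practice this is the sensitive query log.
private_queries = ["red running shoes size 10", "cheap flights to tokyo in may"]

def dp_sgd_step(batch_texts):
    """One DP-SGD step: clip each example's gradient, sum, add Gaussian noise, average."""
    clipped_sum = [torch.zeros_like(p) for p in model.parameters()]
    for text in batch_texts:  # microbatches of size 1 give per-example gradients
        enc = tokenizer(text, return_tensors="pt")
        loss = model(input_ids=enc["input_ids"], labels=enc["input_ids"]).loss
        model.zero_grad()
        loss.backward()
        grads = [p.grad.detach().clone() if p.grad is not None else torch.zeros_like(p)
                 for p in model.parameters()]
        norm = torch.sqrt(sum(g.pow(2).sum() for g in grads)).item()
        scale = CLIP_NORM / max(norm, CLIP_NORM)  # equivalent to min(1, C / ||g||)
        for acc, g in zip(clipped_sum, grads):
            acc.add_(g * scale)
    model.zero_grad()
    for p, acc in zip(model.parameters(), clipped_sum):
        noise = torch.randn_like(acc) * NOISE_MULTIPLIER * CLIP_NORM
        p.grad = (acc + noise) / len(batch_texts)
    optimizer.step()

dp_sgd_step(private_queries)  # in practice: many steps over many sampled batches

# Stage 2: sample synthetic queries from the DP-trained model. As post-processing of a
# DP mechanism, they can be shared and reused without additional privacy cost.
prompt = tokenizer(tokenizer.eos_token, return_tensors="pt")
samples = model.generate(prompt["input_ids"], do_sample=True, top_p=0.95,
                         max_new_tokens=16, num_return_sequences=4,
                         pad_token_id=tokenizer.eos_token_id)
synthetic_queries = [tokenizer.decode(s, skip_special_tokens=True) for s in samples]
print(synthetic_queries)  # feed these into any non-private retrieval training procedure
```

In practice one would use an established DP training library and privacy accountant to track the (ε, δ) guarantee determined by the noise multiplier, sampling rate, and number of steps, rather than this hand-rolled step; the sketch only illustrates why the generated synthetic queries incur no further privacy cost when reused for downstream retrieval training.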