

Privacy-Preserving Recommender Systems with Synthetic Query Generation using Differentially Private Large Language Models

May 10, 2023
作者: Aldo Gael Carranza, Rezsa Farahani, Natalia Ponomareva, Alex Kurakin, Matthew Jagielski, Milad Nasr
cs.AI

Abstract

We propose a novel approach for developing privacy-preserving large-scale recommender systems using differentially private (DP) large language models (LLMs), which overcomes certain challenges and limitations in DP training these complex systems. Our method is particularly well suited for the emerging area of LLM-based recommender systems, but can be readily employed by any recommender system that processes representations of natural language inputs. Our approach involves using DP training methods to fine-tune a publicly pre-trained LLM on a query generation task. The resulting model can generate private synthetic queries representative of the original queries, which can be freely shared for any downstream non-private recommendation training procedure without incurring additional privacy cost. We evaluate our method on its ability to securely train effective deep retrieval models, and we observe significant improvements in their retrieval quality, without compromising query-level privacy guarantees, compared to methods that DP-train the retrieval models directly.
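The approach has two stages: fine-tune a generative query model with a DP optimizer, then sample synthetic queries from it; because sampling is post-processing of a DP-trained model, the synthetic queries can be shared and reused downstream without additional privacy cost. The sketch below illustrates these two stages only in spirit, under assumptions not taken from the paper: a tiny character-level GRU (`QueryLM`) stands in for the publicly pre-trained LLM, the `queries` list and the DP-SGD hyperparameters are hypothetical, the privacy accounting that would yield a concrete (epsilon, delta) is omitted, and the clipping-plus-noise loop is a hand-rolled DP-SGD rather than the authors' training setup.

```python
# Minimal sketch (not the paper's implementation): DP-SGD fine-tuning of a
# generative query model, then free sampling of synthetic queries.
# Assumptions: toy character-level model instead of a pre-trained LLM,
# hypothetical query data, illustrative hyperparameters, no privacy accounting.
import torch
import torch.nn as nn
import torch.nn.functional as F

queries = ["wireless earbuds", "running shoes sale", "vegan pasta recipes"]  # hypothetical private queries
chars = sorted(set("".join(queries)) | {"$"})  # "$" marks end-of-query
stoi = {c: i for i, c in enumerate(chars)}

def encode(q):
    # Characters -> index tensor, with end-of-query marker appended.
    return torch.tensor([stoi[c] for c in q + "$"])

class QueryLM(nn.Module):
    """Tiny next-character language model standing in for the fine-tuned LLM."""
    def __init__(self, vocab, dim=64):
        super().__init__()
        self.emb = nn.Embedding(vocab, dim)
        self.rnn = nn.GRU(dim, dim, batch_first=True)
        self.out = nn.Linear(dim, vocab)
    def forward(self, x):
        h, _ = self.rnn(self.emb(x))
        return self.out(h)

model = QueryLM(len(chars))
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
clip_norm, noise_multiplier = 1.0, 1.1  # DP-SGD parameters (illustrative only)

for epoch in range(200):
    # Per-example gradient clipping: treat each query as its own microbatch.
    summed_grads = [torch.zeros_like(p) for p in model.parameters()]
    for q in queries:
        model.zero_grad()
        ids = encode(q).unsqueeze(0)
        logits = model(ids[:, :-1])
        loss = F.cross_entropy(logits.reshape(-1, logits.size(-1)), ids[:, 1:].reshape(-1))
        loss.backward()
        # Clip this example's gradient to clip_norm, then accumulate.
        total_norm = torch.sqrt(sum(p.grad.norm() ** 2 for p in model.parameters()))
        scale = min(1.0, clip_norm / (total_norm.item() + 1e-6))
        for g, p in zip(summed_grads, model.parameters()):
            g += p.grad * scale
    # Add Gaussian noise calibrated to the clipping norm, then take an SGD step.
    for p, g in zip(model.parameters(), summed_grads):
        noise = torch.randn_like(g) * noise_multiplier * clip_norm
        p.grad = (g + noise) / len(queries)
    optimizer.step()

# After DP training, sampling is post-processing: synthetic queries can be
# generated and shared freely for non-private downstream retrieval training.
@torch.no_grad()
def sample_query(max_len=30):
    itos = {i: c for c, i in stoi.items()}
    ids = [stoi[queries[0][0]]]  # seed with an arbitrary start character
    for _ in range(max_len):
        logits = model(torch.tensor(ids).unsqueeze(0))[0, -1]
        nxt = torch.multinomial(F.softmax(logits, dim=-1), 1).item()
        if itos[nxt] == "$":
            break
        ids.append(nxt)
    return "".join(itos[i] for i in ids)

print([sample_query() for _ in range(3)])  # synthetic queries for downstream training
```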