Harnessing large-language models to generate private synthetic text
June 2, 2023
Authors: Alexey Kurakin, Natalia Ponomareva, Umar Syed, Liam MacDermed, Andreas Terzis
cs.AI
Abstract
Differentially private (DP) training methods like DP-SGD can protect sensitive training data by ensuring that ML models will not reveal private information. An alternative approach, which this paper studies, is to use a sensitive dataset to generate a new synthetic dataset that is differentially private with respect to the original data. Doing so has several advantages: synthetic data can be reused for other tasks (including hyperparameter tuning), retained indefinitely, or shared with third parties without sacrificing privacy.
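
To make the contrast concrete, here is a minimal sketch of the core DP-SGD update (not code from the paper): each example's gradient is clipped to a fixed norm, and calibrated Gaussian noise is added to the aggregate before the parameter step. All hyperparameter values below are illustrative placeholders.

    import numpy as np

    def dp_sgd_step(params, per_example_grads, lr=0.1,
                    clip_norm=1.0, noise_multiplier=1.1):
        # Clip each example's gradient so no single record can dominate.
        clipped = [g * min(1.0, clip_norm / (np.linalg.norm(g) + 1e-12))
                   for g in per_example_grads]
        # Sum the clipped gradients and add calibrated Gaussian noise.
        noisy_sum = np.sum(clipped, axis=0) + np.random.normal(
            scale=noise_multiplier * clip_norm, size=params.shape)
        return params - lr * noisy_sum / len(per_example_grads)

    # Toy usage: one private update on a 3-dimensional parameter vector.
    params = np.zeros(3)
    grads = [np.array([0.5, -2.0, 1.0]), np.array([3.0, 0.1, -0.4])]
    params = dp_sgd_step(params, grads)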
However, obtaining DP data is much harder than introducing DP during training. To make it feasible for text, recent work has utilized public data by starting with a pre-trained generative language model and privately fine-tuning it on sensitive data. This model can then be used to sample a DP synthetic dataset. While this strategy seems straightforward, executing it has proven problematic. Previous approaches either show significant performance loss or have, as we show, critical design flaws.
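
The sampling half of this pipeline might look roughly like the sketch below, assuming a Hugging Face causal LM; the model name, prompt, and decoding settings are placeholders rather than the paper's configuration, and the private fine-tuning step itself is elided.

    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("gpt2")
    # In practice this would be the checkpoint produced by DP fine-tuning
    # on the sensitive corpus; "gpt2" stands in so the sketch runs.
    model = AutoModelForCausalLM.from_pretrained("gpt2")

    def sample_synthetic(prompt, n=4, max_new_tokens=64):
        inputs = tokenizer(prompt, return_tensors="pt")
        outputs = model.generate(
            **inputs,
            do_sample=True,              # sample rather than decode greedily
            top_p=0.95,
            max_new_tokens=max_new_tokens,
            num_return_sequences=n,
            pad_token_id=tokenizer.eos_token_id,
        )
        return [tokenizer.decode(o, skip_special_tokens=True) for o in outputs]

    # Because the generator was trained with DP, the sampled records remain
    # DP with respect to the original sensitive data (post-processing).
    synthetic_examples = sample_synthetic("Review: ")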
In this paper we demonstrate that a proper training objective along with tuning fewer parameters yields excellent DP synthetic data quality. Our approach is competitive with direct DP training of downstream classifiers in terms of performance on downstream tasks. We also demonstrate that our DP synthetic data is useful not only for training downstream classifiers, but also for tuning those same models.
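
The abstract does not spell out which parameters are tuned; one common way to tune fewer parameters is to train low-rank adapters (LoRA) while freezing the base model, sketched below with the peft library. Treat this as an illustrative instantiation under that assumption, not as the paper's exact recipe.

    from peft import LoraConfig, get_peft_model
    from transformers import AutoModelForCausalLM

    base_model = AutoModelForCausalLM.from_pretrained("gpt2")
    lora_config = LoraConfig(
        r=8,                        # low-rank adapter dimension
        lora_alpha=16,
        target_modules=["c_attn"],  # GPT-2 attention projection layers
        lora_dropout=0.05,
        task_type="CAUSAL_LM",
    )
    model = get_peft_model(base_model, lora_config)
    # Only the small adapter matrices receive (noised) gradient updates,
    # which shrinks the parameter count that DP training has to perturb.
    model.print_trainable_parameters()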