Harnessing large-language models to generate private synthetic text

Abstract

Differentially private (DP) training methods like DP-SGD can protect sensitive training data by ensuring that ML models will not reveal private information. An alternative approach, which this paper studies, is to use a sensitive dataset to generate a new synthetic dataset that is differentially private with respect to the original data. Doing so has several advantages: synthetic data can be reused for other tasks (including hyperparameter tuning), retained indefinitely, or shared with third parties without sacrificing privacy. However, obtaining DP data is much harder than introducing DP during training. To make it feasible for text, recent work has utilized public data by starting with a pre-trained generative language model and privately fine-tuning it on sensitive data. This model can then be used to sample a DP synthetic dataset. While this strategy seems straightforward, executing it has proven problematic. Previous approaches either show significant performance loss or have, as we show, critical design flaws. In this paper we demonstrate that a proper training objective, along with tuning fewer parameters, results in excellent DP synthetic data quality. Our approach is competitive with direct DP training of downstream classifiers in terms of performance on downstream tasks. We also demonstrate that our DP synthetic data is useful not only for downstream classifier training but also for tuning those same models.
