Harnessing large-language models to generate private synthetic text

Abstract

Differentially private (DP) training methods like DP-SGD can protect sensitive training data by ensuring that ML models will not reveal private information. An alternative approach, which this paper studies, is to use a sensitive dataset to generate a new synthetic dataset that is differentially private with respect to the original data. Doing so has several advantages: synthetic data can be reused for other tasks (including hyperparameter tuning), retained indefinitely, or shared with third parties without sacrificing privacy. However, obtaining DP data is much harder than introducing DP during training. To make it feasible for text, recent work has leveraged public data by starting with a pre-trained generative language model and privately fine-tuning it on sensitive data. The resulting model can then be used to sample a DP synthetic dataset. While this strategy seems straightforward, executing it has proven problematic. Previous approaches either show significant performance loss or, as we show, have critical design flaws. In this paper we demonstrate that a proper training objective, along with tuning fewer parameters, results in excellent DP synthetic data quality. Our approach is competitive with direct DP training of downstream classifiers in terms of performance on downstream tasks. We also demonstrate that our DP synthetic data is useful not only for training downstream classifiers but also for tuning those same models.

