When Explainability Meets Privacy: An Investigation at the Intersection of Post-hoc Explainability and Differential Privacy in the Context of Natural Language Processing

Abstract

In the study of trustworthy Natural Language Processing (NLP), a number of important research fields have emerged, including explainability and privacy. While research interest in both explainable and privacy-preserving NLP has increased considerably in recent years, there remains a lack of investigation at the intersection of the two. This leaves a considerable gap in our understanding of whether explainability and privacy can be achieved together, or whether the two are at odds with each other. In this work, we conduct an empirical investigation into the privacy-explainability trade-off in the context of NLP, guided by the popular overarching methods of Differential Privacy (DP) and Post-hoc Explainability. Our findings shed light on the intricate relationship between privacy and explainability, which is shaped by a number of factors, including the nature of the downstream task and the choice of text privatization and explainability methods. We highlight the potential for privacy and explainability to co-exist, and we summarize our findings in a collection of practical recommendations for future work at this important intersection.
