
Enhanced Privacy and Communication Efficiency in Non-IID Federated Learning with Adaptive Quantization and Differential Privacy

April 25, 2026
Authors: Emre Ardıç, Yakup Genç
cs.AI

Abstract

Federated learning (FL) is a distributed machine learning method where multiple devices collaboratively train a model under the management of a central server without sharing underlying data. One of the key challenges of FL is the communication bottleneck caused by variations in connection speed and bandwidth across devices. Therefore, it is essential to reduce the size of transmitted data during training. Additionally, there is a potential risk of exposing sensitive information through the model or gradient analysis during training. To address both privacy and communication efficiency, we combine differential privacy (DP) and adaptive quantization methods. We use Laplacian-based DP to preserve privacy, which is relatively underexplored in FL and offers tighter privacy guarantees than Gaussian-based DP. We propose a simple and efficient global bit-length scheduler using round-based cosine annealing, along with a client-based scheduler that dynamically adapts based on client contribution estimated through dataset entropy analysis. We evaluate our approach through extensive experiments on CIFAR10, MNIST, and medical imaging datasets, using non-IID data distributions across varying client counts, bit-length schedulers, and privacy budgets. The results show that our adaptive quantization methods reduce total communicated data by up to 52.64% for MNIST, 45.06% for CIFAR10, and 31% to 37% for medical imaging datasets compared to 32-bit float training while maintaining competitive model accuracy and ensuring robust privacy through differential privacy.
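The round-based cosine-annealed global bit-length scheduler described above can be sketched as follows. The paper only states that the global bit length follows cosine annealing over rounds; the bounds `b_max=32` and `b_min=4` and the annealing direction (high precision early, low precision late) are illustrative assumptions.

```python
import math

def global_bit_length(round_idx: int, total_rounds: int,
                      b_max: int = 32, b_min: int = 4) -> int:
    """Cosine-annealed quantization bit length for a given training round.

    Starts at b_max (full precision) and decays to b_min over training.
    b_max/b_min and the decay direction are illustrative assumptions,
    not values from the paper.
    """
    # Standard cosine annealing factor: 1.0 at round 0, 0.0 at the last round.
    cos_term = (1 + math.cos(math.pi * round_idx / total_rounds)) / 2
    return round(b_min + (b_max - b_min) * cos_term)
```

For example, over 100 rounds this schedule yields 32 bits at round 0, 18 bits at the midpoint, and 4 bits at the final round, so communication cost shrinks as training converges.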
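The Laplacian-based DP step can be illustrated with a minimal sketch: clip a client update to a fixed L1 norm (so its L1 sensitivity is bounded), then add Laplace noise scaled to that sensitivity, which yields pure ε-DP. The parameter names and the clipping-then-noise structure are common practice, assumed here rather than taken from the paper.

```python
import numpy as np

def laplace_dp_update(update: np.ndarray, clip_norm: float = 1.0,
                      epsilon: float = 1.0, rng=None) -> np.ndarray:
    """Clip an update to L1 norm <= clip_norm, then add Laplace noise.

    With L1 sensitivity clip_norm, noise scale clip_norm / epsilon gives
    pure epsilon-DP (no delta), in contrast to the (epsilon, delta)
    guarantee of the Gaussian mechanism. Parameter defaults are
    illustrative assumptions.
    """
    rng = rng or np.random.default_rng()
    l1 = np.abs(update).sum()
    if l1 > clip_norm:
        update = update * (clip_norm / l1)  # scale down to the L1 ball
    scale = clip_norm / epsilon
    return update + rng.laplace(0.0, scale, size=update.shape)
```

A smaller ε means a larger noise scale and stronger privacy at the cost of accuracy, which is the privacy-budget trade-off the experiments vary.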
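The client-based scheduler estimates each client's contribution from the entropy of its local dataset. A minimal sketch, assuming label-distribution Shannon entropy and a linear mapping from normalized entropy to bit length (the mapping is an illustrative assumption; the paper states only that per-client bit lengths adapt to entropy-estimated contribution):

```python
import math
from collections import Counter

def label_entropy(labels) -> float:
    """Shannon entropy (in bits) of a client's label distribution."""
    counts = Counter(labels)
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

def client_bit_length(labels, num_classes: int,
                      b_min: int = 4, b_max: int = 32) -> int:
    """Map normalized label entropy to a per-client quantization bit length.

    Clients with more diverse (higher-entropy) data get more bits; highly
    skewed non-IID clients get fewer. The linear mapping and b_min/b_max
    are illustrative assumptions.
    """
    h_max = math.log2(num_classes)
    frac = label_entropy(labels) / h_max if h_max > 0 else 0.0
    return round(b_min + (b_max - b_min) * frac)
```

Under this sketch, a client with a uniform label distribution over 2 classes gets the full 32 bits, while a client holding a single class gets the minimum 4 bits.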