Ferret: Federated Full-Parameter Tuning at Scale for Large Language Models
September 10, 2024
Authors: Yao Shu, Wenyang Hu, See-Kiong Ng, Bryan Kian Hsiang Low, Fei Richard Yu
cs.AI
Abstract
Large Language Models (LLMs) have become indispensable in numerous real-world
applications. Unfortunately, fine-tuning these models at scale, especially in
federated settings where data privacy and communication efficiency are
critical, presents significant challenges. Existing methods often resort to
parameter-efficient fine-tuning (PEFT) to mitigate communication overhead, but
this typically comes at the cost of model accuracy. To address these
limitations, we propose federated full-parameter tuning at scale for LLMs
(Ferret), the first first-order method with shared randomness to enable
scalable full-parameter tuning of LLMs across decentralized data sources while
maintaining competitive model accuracy. Ferret accomplishes this through three
aspects: (1) it employs widely applied first-order methods for efficient local
updates; (2) it projects these updates into a low-dimensional space to
considerably reduce communication overhead; and (3) it reconstructs local
updates from this low-dimensional space with shared randomness to facilitate
effective full-parameter global aggregation, ensuring fast convergence and
competitive final performance. Our rigorous theoretical analyses and insights,
along with extensive experiments, show that Ferret significantly enhances the
scalability of existing federated full-parameter tuning approaches by achieving
high computational efficiency, reduced communication overhead, and fast
convergence, all while maintaining competitive model accuracy. Our
implementation is available at https://github.com/allen4747/Ferret.
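
Steps (2) and (3) of the abstract, projecting a local update into a low-dimensional space and reconstructing it with shared randomness, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (shared_basis, project, reconstruct, aggregate), the Gaussian random basis, and the 1/k scaling are all assumptions chosen for illustration, and the projection actually used by Ferret may differ. The only idea taken from the abstract is that client and server derive the same random basis from a shared seed, so only k scalars per client need to be communicated.

```python
import numpy as np


def shared_basis(seed: int, dim: int, k: int) -> np.ndarray:
    """Regenerate the same (dim, k) Gaussian matrix from a shared seed.

    Assumption: a standard Gaussian basis; the paper may use another choice.
    """
    rng = np.random.default_rng(seed)
    return rng.standard_normal((dim, k))


def project(update: np.ndarray, seed: int, k: int) -> np.ndarray:
    """Client side: compress a dim-sized local update into k scalars."""
    basis = shared_basis(seed, update.size, k)
    # Dividing by k makes basis @ coeffs an unbiased estimate of the update,
    # because E[basis @ basis.T] = k * I for a standard Gaussian basis.
    return basis.T @ update / k


def reconstruct(coeffs: np.ndarray, seed: int, dim: int) -> np.ndarray:
    """Server side: rebuild an approximate full-size update from k scalars."""
    basis = shared_basis(seed, dim, coeffs.size)
    return basis @ coeffs


def aggregate(client_coeffs, seeds, dim: int) -> np.ndarray:
    """Average the reconstructed updates of all clients (plain mean)."""
    rebuilt = [reconstruct(c, s, dim) for c, s in zip(client_coeffs, seeds)]
    return np.mean(rebuilt, axis=0)


if __name__ == "__main__":
    dim, k = 1000, 200
    # Hypothetical local updates; in practice these would come from
    # first-order (gradient-based) training on each client's data.
    data_rng = np.random.default_rng(0)
    updates = [data_rng.standard_normal(dim) for _ in range(3)]
    seeds = [11, 22, 33]

    coeffs = [project(u, s, k) for u, s in zip(updates, seeds)]
    global_update = aggregate(coeffs, seeds, dim)
    print("scalars sent per client:", coeffs[0].size, "instead of", dim)
    print("global update shape:", global_update.shape)
```

In this sketch each client transmits k = 200 numbers instead of 1000, and the server never receives the random basis itself, only the seed needed to regenerate it. The reconstruction is noisy for any single round; the sketch makes no claim about the convergence behavior analyzed in the paper.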