RiemannLoRA: A Unified Riemannian Framework for Ambiguity-Free LoRA Optimization
July 16, 2025
Authors: Vladimir Bogachev, Vladimir Aletov, Alexander Molozhavenko, Denis Bobkov, Vera Soboleva, Aibek Alanov, Maxim Rakhuba
cs.AI
Abstract
Low-Rank Adaptation (LoRA) has become a widely adopted standard for
parameter-efficient fine-tuning of large language models (LLMs), significantly
reducing memory and computational demands. However, challenges remain,
including finding optimal initialization strategies and mitigating
overparametrization in low-rank matrix factorization. In this work, we propose
a novel approach that addresses both challenges simultaneously within a
unified framework. Our method treats the set of fixed-rank LoRA matrices as a
smooth manifold. Considering adapters as elements of this manifold removes
overparametrization, while determining the direction of the fastest loss
decrease along the manifold provides an initialization strategy. Special care
is taken to obtain a numerically stable and computationally efficient
implementation of our method, using best practices from numerical linear
algebra and Riemannian optimization. Experimental results on LLM and diffusion
model architectures demonstrate that RiemannLoRA consistently improves both
convergence speed and final performance over standard LoRA and its
state-of-the-art modifications.