RiemannLoRA: A Unified Riemannian Framework for Ambiguity-Free LoRA Optimization
July 16, 2025
Authors: Vladimir Bogachev, Vladimir Aletov, Alexander Molozhavenko, Denis Bobkov, Vera Soboleva, Aibek Alanov, Maxim Rakhuba
cs.AI
Abstract
Low-Rank Adaptation (LoRA) has become a widely adopted standard for
parameter-efficient fine-tuning of large language models (LLMs), significantly
reducing memory and computational demands. However, challenges remain,
including finding optimal initialization strategies and mitigating
overparametrization in low-rank matrix factorization. In this work, we propose
a novel approach that addresses both challenges simultaneously within a
unified framework. Our method treats the set of fixed-rank LoRA matrices as a
smooth manifold. Considering adapters as elements of this manifold removes
overparametrization, while determining the direction of the fastest loss
decrease along the manifold provides an initialization strategy. Special care
is taken to obtain a numerically stable and computationally efficient
implementation of our method, using best practices from numerical linear
algebra and Riemannian optimization. Experimental results on LLM and diffusion
model architectures demonstrate that RiemannLoRA consistently improves both
convergence speed and final performance over standard LoRA and its
state-of-the-art modifications.