RiemannLoRA: A Unified Riemannian Framework for Ambiguity-Free LoRA Optimization

Abstract

Low-Rank Adaptation (LoRA) has become a widely adopted standard for parameter-efficient fine-tuning of large language models (LLMs), significantly reducing memory and computational demands. However, challenges remain, including finding optimal initialization strategies and mitigating overparametrization in low-rank matrix factorization. In this work, we propose a novel approach that addresses both challenges simultaneously within a unified framework. Our method treats the set of fixed-rank LoRA matrices as a smooth manifold. Considering adapters as elements of this manifold removes overparametrization, while determining the direction of the fastest loss decrease along the manifold provides initialization. Special care is taken to obtain a numerically stable and computationally efficient implementation of our method, using best practices from numerical linear algebra and Riemannian optimization. Experimental results on LLM and diffusion model architectures demonstrate that RiemannLoRA consistently improves both convergence speed and final performance over standard LoRA and its state-of-the-art modifications.
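
The overparametrization mentioned in the abstract can be illustrated with a small NumPy sketch. It is not the authors' implementation: the function names, the matrix sizes, the fixed matrix S, and the random stand-in gradient G are all assumptions chosen for illustration. The sketch shows that two factor pairs (A, B) and (A S, S^{-1} B) define the same adapter yet induce different first-order updates under a plain gradient step on the factors, whereas the gradient projected onto the tangent space of the fixed-rank manifold (the standard textbook projection formula) depends only on the product A B.

```python
import numpy as np


def tangent_projection(U, Vt, G):
    """Project G onto the tangent space of the fixed-rank manifold at W = U S Vt.

    U (m x r) has orthonormal columns and Vt (r x n) has orthonormal rows.
    Standard formula: P(G) = U U^T G + G V V^T - U U^T G V V^T.
    """
    UtG = U.T @ G          # r x n
    GV = G @ Vt.T          # m x r
    return U @ UtG + GV @ Vt - U @ (UtG @ Vt.T) @ Vt


def factor_step_direction(A, B, G):
    """First-order change of A @ B after a plain gradient step on (A, B).

    With dL/dA = G B^T and dL/dB = A^T G, the product moves along
    -(G B^T B + A A^T G), which depends on the chosen factorization.
    """
    return -(G @ B.T @ B + A @ A.T @ G)


def riemannian_direction(A, B, G):
    """Negative projected gradient; depends only on the product A @ B."""
    r = A.shape[1]
    U, _, Vt = np.linalg.svd(A @ B, full_matrices=False)
    return -tangent_projection(U[:, :r], Vt[:r, :], G)


rng = np.random.default_rng(0)
m, n, r = 6, 5, 2                      # illustrative sizes (assumption)
A = rng.standard_normal((m, r))
B = rng.standard_normal((r, n))
G = rng.standard_normal((m, n))        # stand-in for dL/dW (assumption)

# Any invertible S yields a different factor pair for the same adapter.
S = np.array([[2.0, 1.0], [0.0, 0.5]])
A2, B2 = A @ S, np.linalg.solve(S, B)
same_product = np.allclose(A @ B, A2 @ B2)

fact_1 = factor_step_direction(A, B, G)
fact_2 = factor_step_direction(A2, B2, G)
riem_1 = riemannian_direction(A, B, G)
riem_2 = riemannian_direction(A2, B2, G)

print("same adapter:", same_product)
print("factor-space directions agree:", np.allclose(fact_1, fact_2))
print("manifold directions agree:", np.allclose(riem_1, riem_2))
```

The SVD is used here only to obtain orthonormal bases of the column and row spaces of the adapter; the projection involves those bases solely through U U^T and V V^T, so the sign and rotation ambiguity of the SVD does not affect the result.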
