GraLoRA: Granular Low-Rank Adaptation for Parameter-Efficient Fine-Tuning

Abstract

Low-Rank Adaptation (LoRA) is a popular method for parameter-efficient fine-tuning (PEFT) of generative models, valued for its simplicity and effectiveness. Despite recent enhancements, LoRA still suffers from a fundamental limitation: it overfits when the bottleneck is widened. It performs best at ranks 32 to 64; at higher ranks its accuracy stagnates or declines, and it still falls short of full fine-tuning (FFT) performance. We identify the root cause as LoRA's structural bottleneck, which entangles gradients across unrelated input channels and distorts gradient propagation. To address this, we introduce a novel structure, Granular Low-Rank Adaptation (GraLoRA), which partitions weight matrices into sub-blocks, each with its own low-rank adapter. With negligible computational or storage cost, GraLoRA overcomes LoRA's limitations, effectively increases representational capacity, and more closely approximates FFT behavior. Experiments on code generation and commonsense reasoning benchmarks show that GraLoRA consistently outperforms LoRA and other baselines, achieving up to +8.5% absolute gain in Pass@1 on HumanEval+. These improvements hold across model sizes and rank settings, making GraLoRA a scalable and robust solution for PEFT. Code, data, and scripts are available at https://github.com/SqueezeBits/GraLoRA.git
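
The block-wise idea described in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation (see the linked repository for that). It assumes a k x k grid of sub-blocks with a per-block rank of rank // k, a choice made here only so that the parameter count matches plain LoRA. The function names, matrix sizes, and random initialization are all hypothetical and serve only to compare the two update structures.

```python
import numpy as np

def lora_update(n_out, n_in, rank, rng):
    """Plain LoRA: a single low-rank product B @ A covers the whole matrix."""
    B = rng.standard_normal((n_out, rank))
    A = rng.standard_normal((rank, n_in))
    return B @ A, A.size + B.size

def blockwise_update(n_out, n_in, rank, k, rng):
    """Block-wise sketch: a k x k grid of sub-blocks, each with its own
    low-rank adapter. The per-block rank (rank // k) is an assumption made
    for this illustration so the parameter count equals plain LoRA."""
    assert n_out % k == 0 and n_in % k == 0 and rank % k == 0
    block_out, block_in, block_rank = n_out // k, n_in // k, rank // k
    grid, n_params = [], 0
    for _ in range(k):
        row = []
        for _ in range(k):
            # Random values are for illustration only; real adapters are trained.
            B = rng.standard_normal((block_out, block_rank))
            A = rng.standard_normal((block_rank, block_in))
            row.append(B @ A)
            n_params += A.size + B.size
        grid.append(row)
    # Assemble the k x k grid of sub-block updates into one full-size matrix.
    return np.block(grid), n_params

rng = np.random.default_rng(0)
delta_lora, p_lora = lora_update(16, 16, 4, rng)
delta_block, p_block = blockwise_update(16, 16, 4, 2, rng)

print("parameters:", p_lora, p_block)
print("rank of update:",
      np.linalg.matrix_rank(delta_lora),
      np.linalg.matrix_rank(delta_block))
```

Under these assumptions both variants use the same number of adapter parameters, while the assembled block-wise update is bounded by rank k * rank rather than rank. This is one way to read the abstract's claim of increased representational capacity at negligible extra cost.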
