VeRA: Vector-based Random Matrix Adaptation

Abstract

Low-rank adaptation (LoRA) is a popular method that reduces the number of trainable parameters when fine-tuning large language models, but it still faces acute storage challenges when scaling to even larger models or deploying numerous per-user or per-task adapted models. In this work, we present Vector-based Random Matrix Adaptation (VeRA), which reduces the number of trainable parameters by 10x compared to LoRA while maintaining the same performance. It achieves this by using a single pair of low-rank matrices shared across all layers and learning small scaling vectors instead. We demonstrate its effectiveness on the GLUE and E2E benchmarks, and show its application to instruction-following with just 1.4M parameters using the Llama2 7B model.
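The mechanism described in the abstract can be illustrated with a minimal sketch. The abstract states only that one pair of low-rank matrices is shared across all layers and that small scaling vectors are learned. Everything else here is an assumption made for illustration: the placement of the two vectors (one scaling the rank dimension, one scaling the output dimension), the zero initialization of the output vector, the toy sizes, and the names `vera_delta`, `lora_trainable`, and `vera_trainable`.

```python
import random


def random_matrix(rows, cols, rng):
    """Build a random matrix as a list of rows; it is frozen, never trained."""
    return [[rng.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    """Plain matrix-vector product."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def vera_delta(x, A, B, d, b):
    """Adapter update for one layer: b * (B @ (d * (A @ x))), '*' elementwise.

    A and B are the shared frozen matrices; d and b are this layer's
    trainable scaling vectors (their placement is an assumption here).
    """
    low = matvec(A, x)                          # project down to rank r
    low = [di * v for di, v in zip(d, low)]     # scale each rank dimension
    high = matvec(B, low)                       # project up to d_out
    return [bi * v for bi, v in zip(b, high)]   # scale each output dimension


def lora_trainable(num_layers, d_in, d_out, rank):
    """LoRA trains two rank-r matrices per adapted layer."""
    return num_layers * rank * (d_in + d_out)


def vera_trainable(num_layers, d_out, rank):
    """This sketch trains only the two scaling vectors per adapted layer."""
    return num_layers * (rank + d_out)


# Toy sizes, chosen only to keep the example small.
rng = random.Random(0)
d_in, d_out, rank, num_layers = 8, 8, 4, 3

A = random_matrix(rank, d_in, rng)    # shared by every layer, frozen
B = random_matrix(d_out, rank, rng)   # shared by every layer, frozen

# Per-layer trainable vectors; zero-initialized b is an assumption that
# makes the adapter a no-op before training.
layers = [{"d": [0.1] * rank, "b": [0.0] * d_out} for _ in range(num_layers)]

x = [1.0] * d_in
delta_at_init = vera_delta(x, A, B, layers[0]["d"], layers[0]["b"])
delta_scaled = vera_delta(x, A, B, layers[0]["d"], [1.0] * d_out)

lora_count = lora_trainable(num_layers, d_in, d_out, rank)
vera_count = vera_trainable(num_layers, d_out, rank)
```

In this sketch, only the per-layer vectors would need to be stored for each adapted model, since the shared matrices could be regenerated from the random seed. The toy counts are not meant to reproduce the 10x figure reported in the abstract, which depends on the model sizes and ranks used in the paper.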
