Mixture-of-Subspaces in Low-Rank Adaptation
June 16, 2024
Authors: Taiqiang Wu, Jiahao Wang, Zhe Zhao, Ngai Wong
cs.AI
Abstract
In this paper, we introduce a subspace-inspired Low-Rank Adaptation (LoRA)
method, which is computationally efficient, easy to implement, and readily
applicable to large language, multimodal, and diffusion models. Initially, we
equivalently decompose the weights of LoRA into two subspaces, and find that
simply mixing them can enhance performance. To study such a phenomenon, we
revisit it through a fine-grained subspace lens, showing that such modification
is equivalent to employing a fixed mixer to fuse the subspaces. To be more
flexible, we jointly learn the mixer with the original LoRA weights, and term
the method Mixture-of-Subspaces LoRA (MoSLoRA). MoSLoRA consistently
outperforms LoRA on tasks in different modalities, including commonsense
reasoning, visual instruction tuning, and subject-driven text-to-image
generation, demonstrating its effectiveness and robustness. Codes are available
at https://github.com/wutaiqiang/MoSLoRA.
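The core change over LoRA can be sketched in a few lines. Below is a minimal PyTorch illustration, assuming a frozen base `nn.Linear` and the update form ΔW = B·M·A, where M is a learnable r×r mixer; the class name `MoSLoRALinear`, the identity initialization of M, and the scaling convention are illustrative assumptions for this sketch, not the authors' reference implementation (see the linked repository for that).

```python
import torch
import torch.nn as nn

class MoSLoRALinear(nn.Module):
    """Hypothetical wrapper: a frozen Linear plus a MoSLoRA-style update.

    Vanilla LoRA:    h = W0 x + (alpha / r) * B A x
    MoSLoRA sketch:  h = W0 x + (alpha / r) * B M A x,
    where M is a learnable r x r mixer that fuses the rank-1 subspaces
    of the down-projection A and the up-projection B.
    """

    def __init__(self, base: nn.Linear, r: int = 8, alpha: int = 16):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False  # keep the pretrained weight W0 frozen
        # Down-projection A: (r, d_in), small random init.
        self.A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        # Mixer M: (r, r); identity init recovers vanilla LoRA at step 0
        # (an assumption here, chosen so the sketch degrades gracefully).
        self.M = nn.Parameter(torch.eye(r))
        # Up-projection B: (d_out, r), zero init, so training starts
        # exactly from the pretrained model's behavior.
        self.B = nn.Parameter(torch.zeros(base.out_features, r))
        self.scaling = alpha / r

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # W0 x + (alpha / r) * B M A x
        return self.base(x) + self.scaling * (x @ self.A.T @ self.M.T @ self.B.T)
```

Per the abstract, naively mixing the two decomposed subspaces corresponds to using a *fixed* mixer; MoSLoRA's contribution is to learn M jointly with A and B, adding only r×r extra parameters per adapted layer.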