Drag-and-Drop LLMs: Zero-Shot Prompt-to-Weights

Abstract

Modern Parameter-Efficient Fine-Tuning (PEFT) methods such as low-rank adaptation (LoRA) reduce the cost of customizing large language models (LLMs), yet still require a separate optimization run for every downstream dataset. We introduce Drag-and-Drop LLMs (DnD), a prompt-conditioned parameter generator that eliminates per-task training by mapping a handful of unlabeled task prompts directly to LoRA weight updates. A lightweight text encoder distills each prompt batch into condition embeddings, which are then transformed by a cascaded hyper-convolutional decoder into the full set of LoRA matrices. Once trained on a diverse collection of prompt-checkpoint pairs, DnD produces task-specific parameters in seconds, yielding i) up to 12,000× lower overhead than full fine-tuning, ii) average performance gains of up to 30% over the strongest training LoRAs on unseen common-sense reasoning, math, coding, and multimodal benchmarks, and iii) robust cross-domain generalization despite never seeing the target data or labels. Our results demonstrate that prompt-conditioned parameter generation is a viable alternative to gradient-based adaptation for rapidly specializing LLMs. Our project is available at https://jerryliang24.github.io/DnD.
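To make the prompt-to-weights data flow concrete, the following is a minimal toy sketch and not the authors' implementation. It assumes NumPy, replaces the lightweight text encoder with a hypothetical character-count embedding (`embed_prompts`), and replaces the cascaded hyper-convolutional decoder with a single untrained random linear map (`ToyLoRAGenerator`). All names, dimensions, and the `alpha` scaling are illustrative assumptions. Only the overall shape of the pipeline follows the abstract: unlabeled prompts, then condition embeddings, then LoRA matrices, then a weight update with no gradient step on the target task.

```python
import numpy as np


def embed_prompts(prompts, dim=16):
    """Hypothetical stand-in for the text encoder: one unit-norm
    character-count vector per prompt (not a real language encoder)."""
    out = np.zeros((len(prompts), dim))
    for i, prompt in enumerate(prompts):
        for ch in prompt:
            out[i, ord(ch) % dim] += 1.0
        norm = np.linalg.norm(out[i])
        if norm > 0:
            out[i] /= norm
    return out


class ToyLoRAGenerator:
    """Hypothetical stand-in for the decoder: an untrained linear map from
    the pooled condition embedding to the flattened LoRA factors (A, B)."""

    def __init__(self, embed_dim, d_out, d_in, rank, seed=0):
        rng = np.random.default_rng(seed)
        self.d_out, self.d_in, self.rank = d_out, d_in, rank
        n_params = rank * d_in + d_out * rank
        self.proj = rng.normal(scale=0.02, size=(embed_dim, n_params))

    def __call__(self, cond):
        pooled = cond.mean(axis=0)        # pool the prompt batch
        flat = pooled @ self.proj         # generate all LoRA entries at once
        split = self.rank * self.d_in
        A = flat[:split].reshape(self.rank, self.d_in)
        B = flat[split:].reshape(self.d_out, self.rank)
        return A, B


def apply_lora(W, A, B, alpha=1.0):
    """Standard LoRA update: W' = W + alpha * (B @ A)."""
    return W + alpha * (B @ A)


# Unlabeled task prompts only; no answers and no optimization loop.
prompts = ["What is 12 * 7?", "Solve for x: 2x + 3 = 11."]
cond = embed_prompts(prompts)
gen = ToyLoRAGenerator(embed_dim=16, d_out=8, d_in=6, rank=2)
A, B = gen(cond)
W = np.zeros((8, 6))                      # placeholder frozen base weight
W_adapted = apply_lora(W, A, B)
```

In this sketch the generated update `B @ A` has rank at most 2 by construction, which is the property that makes LoRA updates cheap to produce and store. In the system described above, the generator would first be trained on prompt-checkpoint pairs so that its outputs are useful; here its weights are random, so the result only illustrates shapes and data flow.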
