Drag-and-Drop LLMs: Zero-Shot Prompt-to-Weights

Abstract

Modern Parameter-Efficient Fine-Tuning (PEFT) methods such as low-rank adaptation (LoRA) reduce the cost of customizing large language models (LLMs), yet still require a separate optimization run for every downstream dataset. We introduce Drag-and-Drop LLMs (DnD), a prompt-conditioned parameter generator that eliminates per-task training by mapping a handful of unlabeled task prompts directly to LoRA weight updates. A lightweight text encoder distills each prompt batch into condition embeddings, which are then transformed by a cascaded hyper-convolutional decoder into the full set of LoRA matrices. Once trained on a diverse collection of prompt-checkpoint pairs, DnD produces task-specific parameters in seconds, yielding i) up to 12,000× lower overhead than full fine-tuning, ii) average gains of up to 30% in performance over the strongest training LoRAs on unseen common-sense reasoning, math, coding, and multimodal benchmarks, and iii) robust cross-domain generalization despite never seeing the target data or labels. Our results demonstrate that prompt-conditioned parameter generation is a viable alternative to gradient-based adaptation for rapidly specializing LLMs. Our project is available at https://jerryliang24.github.io/DnD.
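The data flow described in the abstract (prompt batch, condition embedding, generated LoRA matrices, adapted weights) can be illustrated with a toy sketch. Everything below is an assumption made for illustration and is not the authors' implementation: the hash-based `embed_prompt` stands in for the lightweight text encoder, the single random linear map in `ToyLoRAGenerator` stands in for the cascaded hyper-convolutional decoder, and the dimensions, names, and example prompts are hypothetical. Only the final step, adding the low-rank product B·A to a weight matrix, is the standard LoRA update.

```python
import hashlib
import random


def embed_prompt(prompt, dim=8):
    """Hypothetical stand-in for the text encoder: deterministic hash-based embedding."""
    digest = hashlib.sha256(prompt.encode("utf-8")).digest()
    return [digest[i] / 255.0 - 0.5 for i in range(dim)]


def condition_from_batch(prompts, dim=8):
    """Average the per-prompt embeddings into one condition vector (an assumption)."""
    vecs = [embed_prompt(p, dim) for p in prompts]
    return [sum(col) / len(vecs) for col in zip(*vecs)]


def matmul(x, y):
    """Plain-Python matrix product of two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*y)] for row in x]


class ToyLoRAGenerator:
    """Hypothetical stand-in for the decoder: one fixed linear map from the
    condition vector to the entries of LoRA factors A (r x d_in) and B (d_out x r)."""

    def __init__(self, cond_dim, d_in, d_out, rank, seed=0):
        rng = random.Random(seed)
        self.d_in, self.d_out, self.rank = d_in, d_out, rank
        n_out = rank * d_in + d_out * rank
        self.proj = [[rng.gauss(0.0, 0.1) for _ in range(cond_dim)]
                     for _ in range(n_out)]

    def generate(self, condition):
        # Project the condition vector, then reshape the flat output into A and B.
        flat = [sum(w * c for w, c in zip(row, condition)) for row in self.proj]
        split = self.rank * self.d_in
        a = [flat[i * self.d_in:(i + 1) * self.d_in] for i in range(self.rank)]
        b_flat = flat[split:]
        b = [b_flat[i * self.rank:(i + 1) * self.rank] for i in range(self.d_out)]
        return a, b


def apply_lora(weight, a, b, scale=1.0):
    """Return W + scale * (B @ A), the standard LoRA-style weight update."""
    delta = matmul(b, a)
    return [[w + scale * d for w, d in zip(w_row, d_row)]
            for w_row, d_row in zip(weight, delta)]


# Unlabeled task prompts only: no answers or labels are used anywhere.
prompts = ["What is 12 * 7?", "Solve for x: 2x + 3 = 11."]
cond = condition_from_batch(prompts)
gen = ToyLoRAGenerator(cond_dim=8, d_in=4, d_out=3, rank=2)
A, B = gen.generate(cond)
W = [[0.0] * 4 for _ in range(3)]  # placeholder for one frozen base weight matrix
W_adapted = apply_lora(W, A, B)
print(len(A), len(A[0]), len(B), len(B[0]))  # 2 4 3 2
```

In this sketch a single forward pass through the generator replaces the per-task optimization loop. The generator's own parameters (here the random `proj`) would have to be learned beforehand, which the abstract describes as training on prompt-checkpoint pairs.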
