

FIAT: Fusing learning paradigms with Instruction-Accelerated Tuning

September 9, 2023
Authors: Xinyi Wang, John Wieting, Jonathan H. Clark
cs.AI

Abstract

Learning paradigms for large language models (LLMs) currently tend to fall within either in-context learning (ICL) or full fine-tuning. Each of these comes with its own trade-offs based on available data, model size, compute cost, ease of use, and final quality, with neither solution performing well across the board. In this article, we first describe the ICL and fine-tuning paradigms in a way that highlights their natural connections. Based on these connections, we propose a new learning paradigm called FIAT that fuses the best of these paradigms together, enabling prompt-engineered instructions and chain-of-thought reasoning with the very largest models while also using similar methods to perform parameter updates on a modestly sized LLM with parameter-efficient tuning. We evaluate FIAT's effectiveness on a variety of multilingual tasks and observe that FIAT performs better than both ICL and fine-tuning at scales ranging from 100 to 10,000 training examples. We hope that FIAT provides a practical way of harnessing the full potential of LLMs without needing to make a hard choice between learning paradigms.
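
To make the described recipe more concrete, below is a minimal, hypothetical sketch of a FIAT-style pipeline: a large frozen model is prompted with a hand-engineered instruction to produce a chain-of-thought rationale, and a modestly sized model is then tuned with a parameter-efficient (LoRA) adapter on the instruction, rationale, and gold answer. The checkpoint names, prompt format, and LoRA hyperparameters are placeholders for illustration, not the configuration reported in the paper.

```python
# Hypothetical FIAT-style sketch (not the authors' exact recipe).
# Step 1: a large, frozen model generates chain-of-thought rationales via prompting.
# Step 2: a smaller model is tuned with a LoRA adapter on instruction + rationale + answer.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

# --- Step 1: frozen large model used only through prompt-engineered instructions ---
big_name = "big-instruction-tuned-llm"  # placeholder checkpoint name
big_tok = AutoTokenizer.from_pretrained(big_name)
big_model = AutoModelForCausalLM.from_pretrained(big_name, torch_dtype=torch.bfloat16)

def generate_rationale(instruction: str, example_input: str) -> str:
    """Ask the frozen large model for a chain-of-thought rationale."""
    prompt = f"{instruction}\n\nInput: {example_input}\nLet's think step by step:"
    batch = big_tok(prompt, return_tensors="pt").to(big_model.device)
    out = big_model.generate(**batch, max_new_tokens=128, do_sample=False)
    # Return only the newly generated tokens (the rationale).
    return big_tok.decode(out[0, batch["input_ids"].shape[1]:], skip_special_tokens=True)

# --- Step 2: parameter-efficient tuning of a modestly sized model ---
small_name = "modest-llm"  # placeholder checkpoint name
small_tok = AutoTokenizer.from_pretrained(small_name)
small_model = AutoModelForCausalLM.from_pretrained(small_name)
lora = LoraConfig(
    r=8, lora_alpha=16, lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],  # depends on the model architecture
    task_type="CAUSAL_LM",
)
small_model = get_peft_model(small_model, lora)  # only adapter weights are trained

def training_text(instruction: str, example_input: str, target: str) -> str:
    """Build one training example: instruction, generated rationale, gold answer."""
    rationale = generate_rationale(instruction, example_input)
    return f"{instruction}\nInput: {example_input}\nRationale: {rationale}\nAnswer: {target}"
```

The strings produced by `training_text` would then be fed to an ordinary causal-LM training loop over the tuned (adapter-augmented) small model; the large model's parameters are never updated.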