

FIAT: Fusing learning paradigms with Instruction-Accelerated Tuning

September 9, 2023
Authors: Xinyi Wang, John Wieting, Jonathan H. Clark
cs.AI

Abstract

Learning paradigms for large language models (LLMs) currently tend to fall within either in-context learning (ICL) or full fine-tuning. Each of these comes with their own trade-offs based on available data, model size, compute cost, ease-of-use, and final quality with neither solution performing well across-the-board. In this article, we first describe ICL and fine-tuning paradigms in a way that highlights their natural connections. Based on these connections, we propose a new learning paradigm called FIAT that fuses the best of these paradigms together, enabling prompt-engineered instructions and chain-of-thought reasoning with the very largest models while also using similar methods to perform parameter updates on a modestly-sized LLM with parameter-efficient tuning. We evaluate FIAT's effectiveness on a variety of multilingual tasks and observe that FIAT performs better than both ICL and fine-tuning at scales ranging from 100-10,000 training examples. We hope that FIAT provides a practical way of harnessing the full potential of LLMs without needing to make a hard choice between learning paradigms.
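To make the paradigm described in the abstract more concrete, the sketch below illustrates one plausible reading of the FIAT data flow: a frozen, very large model is prompted with an engineered instruction to produce a chain-of-thought rationale, and a modestly sized model is then updated on the rationale-augmented input using parameter-efficient tuning (e.g. LoRA-style adapters). This is not the authors' released code; the model handles, function names, and parameters are hypothetical placeholders.

```python
# Illustrative sketch of the FIAT training loop as described in the abstract.
# All callables below are hypothetical stand-ins, not a real library API.

from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Example:
    input_text: str
    target_text: str


def fiat_train_step(
    example: Example,
    instruction: str,                                   # prompt-engineered instruction for the large model
    large_model_generate: Callable[[str], str],         # frozen, very large LLM queried via prompting (CoT)
    small_model_update: Callable[[str, str], float],    # parameter-efficient update step on the tuned LLM
) -> float:
    # 1) Elicit a chain-of-thought rationale from the frozen large model.
    prompt = f"{instruction}\n\nInput: {example.input_text}\nReasoning:"
    rationale = large_model_generate(prompt)

    # 2) Update the modestly sized model (e.g. its LoRA adapters) on the
    #    rationale-augmented input, targeting the gold answer.
    augmented_input = f"{example.input_text}\n\nRationale: {rationale}"
    return small_model_update(augmented_input, example.target_text)


def fiat_train(
    dataset: List[Example],
    instruction: str,
    large_model_generate: Callable[[str], str],
    small_model_update: Callable[[str, str], float],
) -> None:
    for example in dataset:
        fiat_train_step(example, instruction, large_model_generate, small_model_update)
```

Under this reading, only the small model's adapter parameters receive gradient updates, while the large model contributes purely through prompted generation, which matches the abstract's claim of combining prompt-engineered chain-of-thought with parameter-efficient tuning.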