

Language models are weak learners

June 25, 2023
Authors: Hariharan Manikandan, Yiding Jiang, J. Zico Kolter
cs.AI

Abstract

A central notion in practical and theoretical machine learning is that of a weak learner, classifiers that achieve better-than-random performance (on any given distribution over data), even by a small margin. Such weak learners form the practical basis for canonical machine learning methods such as boosting. In this work, we illustrate that prompt-based large language models can operate effectively as said weak learners. Specifically, we illustrate the use of a large language model (LLM) as a weak learner in a boosting algorithm applied to tabular data. We show that by providing (properly sampled according to the distribution of interest) text descriptions of tabular data samples, LLMs can produce a summary of the samples that serves as a template for classification and achieves the aim of acting as a weak learner on this task. We incorporate these models into a boosting approach, which in some settings can leverage the knowledge within the LLM to outperform traditional tree-based boosting. The model outperforms both few-shot learning and occasionally even more involved fine-tuning procedures, particularly for tasks involving small numbers of data points. The results illustrate the potential for prompt-based LLMs to function not just as few-shot learners themselves, but as components of larger machine learning pipelines.
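The abstract describes the pipeline only in prose: serialize tabular records as text, sample them according to the current boosting distribution, prompt the LLM to distill a summary that acts as a classification rule, and reweight. Below is a minimal AdaBoost-style sketch of one plausible reading of that loop, assuming binary labels in {-1, +1}; the helpers `row_to_text`, `llm_summarize`, and `llm_classify` are hypothetical stand-ins for prompt/API calls, not the authors' released interface.

```python
import numpy as np

def row_to_text(row: dict) -> str:
    """Serialize one tabular record into a natural-language description.
    Hypothetical helper; the paper's exact serialization may differ."""
    return "; ".join(f"{k} is {v}" for k, v in row.items())

def llm_summarize(texts, labels):
    """Prompt an LLM to compress labeled examples into a natural-language
    decision rule (a 'summary'). Stub for a chat-completion API call."""
    raise NotImplementedError

def llm_classify(summary, text) -> int:
    """Ask the LLM to label one record (+1/-1) given the summary. Stub."""
    raise NotImplementedError

def boost_with_llm(rows, y, rounds=10, n_sample=16):
    """AdaBoost-style loop whose weak hypothesis is an LLM summary."""
    y = np.asarray(y)                    # labels in {-1, +1}
    n = len(rows)
    w = np.full(n, 1.0 / n)              # boosting distribution over points
    texts = [row_to_text(r) for r in rows]
    summaries, alphas = [], []
    for _ in range(rounds):
        # Draw a small, distribution-weighted batch for the prompt, then
        # distill it into a weak classifier (the summary).
        idx = np.random.choice(n, size=n_sample, replace=False, p=w)
        summary = llm_summarize([texts[i] for i in idx], y[idx])
        preds = np.array([llm_classify(summary, t) for t in texts])
        err = max(float(np.dot(w, preds != y)), 1e-10)
        if err >= 0.5:                   # no longer better than random: stop
            break
        alpha = 0.5 * np.log((1.0 - err) / err)
        w *= np.exp(-alpha * y * preds)  # upweight misclassified points
        w /= w.sum()
        summaries.append(summary)
        alphas.append(alpha)
    return summaries, alphas

def predict(summaries, alphas, row):
    """Final prediction: sign of the alpha-weighted vote over summaries."""
    text = row_to_text(row)
    votes = sum(a * llm_classify(s, text) for s, a in zip(summaries, alphas))
    return 1 if votes >= 0 else -1
```

The only LLM-specific change to standard AdaBoost is the weak learner itself: instead of fitting a decision stump, each round prompts the model with a reweighted sample and keeps the resulting summary as the hypothesis.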