Language models are weak learners
June 25, 2023
Authors: Hariharan Manikandan, Yiding Jiang, J Zico Kolter
cs.AI
Abstract
A central notion in practical and theoretical machine learning is that of a
weak learner: a classifier that achieves better-than-random performance (on
any given distribution over data), even by a small margin. Such
weak learners form the practical basis for canonical machine learning methods
such as boosting. In this work, we illustrate that prompt-based large language
models can operate effectively as said weak learners. Specifically, we
illustrate the use of a large language model (LLM) as a weak learner in a
boosting algorithm applied to tabular data. We show that by providing (properly
sampled according to the distribution of interest) text descriptions of tabular
data samples, LLMs can produce a summary of the samples that serves as a
template for classification and achieves the aim of acting as a weak learner on
this task. We incorporate these models into a boosting approach, which in some
settings can leverage the knowledge within the LLM to outperform traditional
tree-based boosting. The resulting model outperforms few-shot learning and,
occasionally, even more involved fine-tuning procedures, particularly for tasks
involving small numbers of data points. The results illustrate the potential
for prompt-based LLMs to function not just as few-shot learners themselves, but
as components of larger machine learning pipelines.
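
To make the setup concrete, here is a minimal sketch of the pipeline the abstract describes: tabular rows are serialized into text, an LLM is prompted to summarize a weighted sample of labeled records, and that summary is used as the weak hypothesis inside a standard AdaBoost-style loop. The serialization template, the prompt wording, the 16-example sample size, and the `prompt_llm` wrapper are all illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

def row_to_text(row, feature_names):
    """Serialize one tabular record as a natural-language description.

    Hypothetical template; the paper's exact serialization is not
    specified in the abstract.
    """
    return ". ".join(f"{name} is {value}"
                     for name, value in zip(feature_names, row))

def llm_weak_learner(texts, labels, weights, prompt_llm):
    """Build one hypothesis by asking the LLM to summarize a weighted
    sample of labeled examples; returns classify(text) -> {-1, +1}.

    `prompt_llm` is an assumed callable wrapping whatever LLM API is
    available; it is not part of the paper.
    """
    # Resample training texts in proportion to the boosting weights,
    # so the summary focuses on currently hard examples.
    idx = np.random.choice(len(texts), size=min(16, len(texts)),
                           replace=True, p=weights / weights.sum())
    examples = "\n".join(f"{texts[i]} -> label {labels[i]}" for i in idx)
    summary = prompt_llm(
        "Summarize how the following records relate to their labels:\n"
        + examples)

    def classify(text):
        answer = prompt_llm(
            f"Rule: {summary}\nRecord: {text}\nAnswer +1 or -1:")
        return 1 if "+1" in answer else -1
    return classify

def boost_with_llm(texts, labels, prompt_llm, rounds=10):
    """AdaBoost-style loop with the LLM summary as the weak hypothesis.

    Assumes labels in {-1, +1}.
    """
    labels = np.asarray(labels)
    n = len(texts)
    weights = np.full(n, 1.0 / n)
    ensemble = []  # list of (alpha, hypothesis) pairs
    for _ in range(rounds):
        h = llm_weak_learner(texts, labels, weights, prompt_llm)
        preds = np.array([h(t) for t in texts])
        err = weights[preds != labels].sum()
        if err >= 0.5:  # no better than random: weak-learner condition fails
            break
        alpha = 0.5 * np.log((1 - err) / max(err, 1e-10))
        ensemble.append((alpha, h))
        # Re-weight: misclassified points gain mass for the next round.
        weights *= np.exp(-alpha * labels * preds)
        weights /= weights.sum()
    return lambda t: int(np.sign(sum(a * h(t) for a, h in ensemble)))
```

Note that each evaluation of a hypothesis costs one LLM query, so in practice the per-round predictions on the training set would be batched or cached rather than issued one call at a time.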