Language models are weak learners
June 25, 2023
Authors: Hariharan Manikandan, Yiding Jiang, J Zico Kolter
cs.AI
Abstract
A central notion in practical and theoretical machine learning is that of a
weak learner: a classifier that achieves better-than-random performance (on
any given distribution over data), even by a small margin. Such
weak learners form the practical basis for canonical machine learning methods
such as boosting. In this work, we illustrate that prompt-based large language
models can operate effectively as said weak learners. Specifically, we
illustrate the use of a large language model (LLM) as a weak learner in a
boosting algorithm applied to tabular data. We show that by providing (properly
sampled according to the distribution of interest) text descriptions of tabular
data samples, LLMs can produce a summary of the samples that serves as a
template for classification and achieves the aim of acting as a weak learner on
this task. We incorporate these models into a boosting approach, which in some
settings can leverage the knowledge within the LLM to outperform traditional
tree-based boosting. The resulting model outperforms few-shot learning and,
occasionally, even more involved fine-tuning procedures, particularly for tasks
involving small numbers of data points. The results illustrate the potential
for prompt-based LLMs to function not just as few-shot learners themselves, but
as components of larger machine learning pipelines.
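
To make the setup concrete, here is a minimal sketch of the pipeline the abstract describes: tabular rows are serialized into text, an LLM is prompted to summarize a weighted sample of labeled records, and that summary is used as the weak hypothesis inside a standard AdaBoost-style loop. The serialization template, the prompt wording, the 16-example sample size, and the `prompt_llm` wrapper are all illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

def row_to_text(row, feature_names):
    """Serialize one tabular record as a natural-language description.

    Hypothetical template; the paper's exact serialization is not
    specified in the abstract.
    """
    return ". ".join(f"{name} is {value}"
                     for name, value in zip(feature_names, row))

def llm_weak_learner(texts, labels, weights, prompt_llm):
    """Build one hypothesis by asking the LLM to summarize a weighted
    sample of labeled examples; returns classify(text) -> {-1, +1}.

    `prompt_llm` is an assumed callable wrapping whatever LLM API is
    available; it is not part of the paper.
    """
    # Resample training texts in proportion to the boosting weights,
    # so the summary focuses on currently hard examples.
    idx = np.random.choice(len(texts), size=min(16, len(texts)),
                           replace=True, p=weights / weights.sum())
    examples = "\n".join(f"{texts[i]} -> label {labels[i]}" for i in idx)
    summary = prompt_llm(
        "Summarize how the following records relate to their labels:\n"
        + examples)

    def classify(text):
        answer = prompt_llm(
            f"Rule: {summary}\nRecord: {text}\nAnswer +1 or -1:")
        return 1 if "+1" in answer else -1
    return classify

def boost_with_llm(texts, labels, prompt_llm, rounds=10):
    """AdaBoost-style loop with the LLM summary as the weak hypothesis.

    Assumes labels in {-1, +1}.
    """
    labels = np.asarray(labels)
    n = len(texts)
    weights = np.full(n, 1.0 / n)
    ensemble = []  # list of (alpha, hypothesis) pairs
    for _ in range(rounds):
        h = llm_weak_learner(texts, labels, weights, prompt_llm)
        preds = np.array([h(t) for t in texts])
        err = weights[preds != labels].sum()
        if err >= 0.5:  # no better than random: weak-learner condition fails
            break
        alpha = 0.5 * np.log((1 - err) / max(err, 1e-10))
        ensemble.append((alpha, h))
        # Re-weight: misclassified points gain mass for the next round.
        weights *= np.exp(-alpha * labels * preds)
        weights /= weights.sum()
    return lambda t: int(np.sign(sum(a * h(t) for a, h in ensemble)))
```

Note that each evaluation of a hypothesis costs one LLM query, so in practice the per-round predictions on the training set would be batched or cached rather than issued one call at a time.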