Language models are weak learners
June 25, 2023
Authors: Hariharan Manikandan, Yiding Jiang, J Zico Kolter
cs.AI
Abstract
A central notion in practical and theoretical machine learning is that of a
weak learner: a classifier that achieves better-than-random performance (on
any given distribution over data), even by a small margin. Such
weak learners form the practical basis for canonical machine learning methods
such as boosting. In this work, we illustrate that prompt-based large language
models can operate effectively as said weak learners. Specifically, we
illustrate the use of a large language model (LLM) as a weak learner in a
boosting algorithm applied to tabular data. We show that by providing text
descriptions of tabular data samples (properly sampled according to the
distribution of interest), LLMs can produce a summary of the samples that serves as a
template for classification and achieves the aim of acting as a weak learner on
this task. We incorporate these models into a boosting approach, which in some
settings can leverage the knowledge within the LLM to outperform traditional
tree-based boosting. The model outperforms few-shot learning and, occasionally,
even more involved fine-tuning procedures, particularly for tasks
involving small numbers of data points. The results illustrate the potential
for prompt-based LLMs to function not just as few-shot learners themselves, but
as components of larger machine learning pipelines.
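
To make the approach concrete, below is a minimal Python sketch, not the authors' implementation, of an AdaBoost-style loop whose weak learner is a prompted LLM: each round resamples records according to the current boosting weights, asks the model to summarize their text descriptions, and then uses that summary as the classification template. The function query_llm, the prompt wording, and the sample size are hypothetical placeholders.

# A rough sketch of boosting with an LLM weak learner, under the
# assumptions stated above. `query_llm` stands in for whatever
# prompting interface is available.
import numpy as np

def query_llm(prompt: str) -> str:
    # Placeholder: send `prompt` to an LLM and return its completion.
    raise NotImplementedError("plug in an LLM client here")

def fit_llm_weak_learner(texts, labels, weights, n_samples=16, rng=None):
    # Resample records according to the boosting weights (a prompt cannot
    # weight examples directly), then ask the model for a summary that
    # will serve as the classification template.
    rng = rng or np.random.default_rng()
    idx = rng.choice(len(texts), size=min(n_samples, len(texts)),
                     replace=False, p=weights)
    examples = "\n".join(f"{texts[i]} -> {labels[i]}" for i in idx)
    return query_llm("Summarize how these records relate to their labels:\n"
                     + examples)

def predict_with_summary(summary, text, classes):
    # Classify one record by prompting with the learned summary.
    answer = query_llm(f"{summary}\nRecord: {text}\n"
                       f"Answer with one of {list(classes)}:")
    return answer.strip()

def summary_boosting(texts, labels, classes, n_rounds=10):
    n = len(texts)
    weights = np.full(n, 1.0 / n)  # start from the uniform distribution
    summaries, alphas = [], []
    for _ in range(n_rounds):
        summary = fit_llm_weak_learner(texts, labels, weights)
        preds = [predict_with_summary(summary, t, classes) for t in texts]
        miss = np.array([p != y for p, y in zip(preds, labels)], float)
        err = float(weights @ miss)  # weighted training error
        if err >= 0.5:               # fails the weak-learner bar; retry
            continue
        alpha = 0.5 * np.log((1.0 - err) / max(err, 1e-10))
        weights *= np.exp(alpha * (2.0 * miss - 1.0))  # upweight mistakes
        weights /= weights.sum()
        summaries.append(summary)
        alphas.append(alpha)
    return summaries, alphas

def predict_ensemble(summaries, alphas, text, classes):
    # Final prediction: a weighted vote over the per-round summaries.
    votes = {c: 0.0 for c in classes}
    for summary, alpha in zip(summaries, alphas):
        pred = predict_with_summary(summary, text, classes)
        if pred in votes:  # ignore malformed LLM answers
            votes[pred] += alpha
    return max(votes, key=votes.get)

The resampling step is one way to satisfy the requirement that examples be drawn according to the distribution of interest, since boosting weights cannot be handed to a prompt directly; the weight update shown is the standard binary AdaBoost rule, used here for simplicity.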