Language models are weak learners
June 25, 2023
Authors: Hariharan Manikandan, Yiding Jiang, J Zico Kolter
cs.AI
Abstract
A central notion in practical and theoretical machine learning is that of a
weak learner: a classifier that achieves better-than-random performance (on
any given distribution over data), even by a small margin. Such
weak learners form the practical basis for canonical machine learning methods
such as boosting. In this work, we illustrate that prompt-based large language
models can operate effectively as said weak learners. Specifically, we
illustrate the use of a large language model (LLM) as a weak learner in a
boosting algorithm applied to tabular data. We show that by providing text
descriptions of tabular data samples (properly sampled according to the
distribution of interest), LLMs can produce a summary of the samples that serves as a
template for classification and achieves the aim of acting as a weak learner on
this task. We incorporate these models into a boosting approach, which in some
settings can leverage the knowledge within the LLM to outperform traditional
tree-based boosting. The model outperforms few-shot learning and, occasionally,
even more involved fine-tuning procedures, particularly for tasks
involving small numbers of data points. The results illustrate the potential
for prompt-based LLMs to function not just as few-shot learners themselves, but
as components of larger machine learning pipelines.
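
To make the approach concrete, below is a minimal Python sketch, not the authors' implementation, of an AdaBoost-style loop whose weak learner is a prompted LLM: each round resamples records according to the current boosting weights, asks the model to summarize their text descriptions, and then uses that summary as the classification template. The function query_llm, the prompt wording, and the sample size are hypothetical placeholders.

# A rough sketch of boosting with an LLM weak learner, under the
# assumptions stated above. `query_llm` stands in for whatever
# prompting interface is available.
import numpy as np

def query_llm(prompt: str) -> str:
    # Placeholder: send `prompt` to an LLM and return its completion.
    raise NotImplementedError("plug in an LLM client here")

def fit_llm_weak_learner(texts, labels, weights, n_samples=16, rng=None):
    # Resample records according to the boosting weights (a prompt cannot
    # weight examples directly), then ask the model for a summary that
    # will serve as the classification template.
    rng = rng or np.random.default_rng()
    idx = rng.choice(len(texts), size=min(n_samples, len(texts)),
                     replace=False, p=weights)
    examples = "\n".join(f"{texts[i]} -> {labels[i]}" for i in idx)
    return query_llm("Summarize how these records relate to their labels:\n"
                     + examples)

def predict_with_summary(summary, text, classes):
    # Classify one record by prompting with the learned summary.
    answer = query_llm(f"{summary}\nRecord: {text}\n"
                       f"Answer with one of {list(classes)}:")
    return answer.strip()

def summary_boosting(texts, labels, classes, n_rounds=10):
    n = len(texts)
    weights = np.full(n, 1.0 / n)  # start from the uniform distribution
    summaries, alphas = [], []
    for _ in range(n_rounds):
        summary = fit_llm_weak_learner(texts, labels, weights)
        preds = [predict_with_summary(summary, t, classes) for t in texts]
        miss = np.array([p != y for p, y in zip(preds, labels)], float)
        err = float(weights @ miss)  # weighted training error
        if err >= 0.5:               # fails the weak-learner bar; retry
            continue
        alpha = 0.5 * np.log((1.0 - err) / max(err, 1e-10))
        weights *= np.exp(alpha * (2.0 * miss - 1.0))  # upweight mistakes
        weights /= weights.sum()
        summaries.append(summary)
        alphas.append(alpha)
    return summaries, alphas

def predict_ensemble(summaries, alphas, text, classes):
    # Final prediction: a weighted vote over the per-round summaries.
    votes = {c: 0.0 for c in classes}
    for summary, alpha in zip(summaries, alphas):
        pred = predict_with_summary(summary, text, classes)
        if pred in votes:  # ignore malformed LLM answers
            votes[pred] += alpha
    return max(votes, key=votes.get)

The resampling step is one way to satisfy the requirement that examples be drawn according to the distribution of interest, since boosting weights cannot be handed to a prompt directly; the weight update shown is the standard binary AdaBoost rule, used here for simplicity.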