Language models are weak learners

Abstract

A central notion in practical and theoretical machine learning is that of a weak learner: a classifier that achieves better-than-random performance on any given distribution over data, even if only by a small margin. Such weak learners form the practical basis for canonical machine learning methods such as boosting. In this work, we show that prompt-based large language models (LLMs) can operate effectively as such weak learners. Specifically, we illustrate the use of an LLM as a weak learner in a boosting algorithm applied to tabular data. We show that, given text descriptions of tabular data samples (properly sampled according to the distribution of interest), an LLM can produce a summary of the samples that serves as a template for classification and thereby acts as a weak learner on this task. We incorporate these models into a boosting approach, which in some settings can leverage the knowledge within the LLM to outperform traditional tree-based boosting. The model outperforms few-shot learning and occasionally even more involved fine-tuning procedures, particularly for tasks involving small numbers of data points. The results illustrate the potential for prompt-based LLMs to function not just as few-shot learners themselves, but as components of larger machine learning pipelines.
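To make the role of a weak learner inside a boosting loop concrete, the following is a minimal sketch, not the procedure used in this work. It assumes AdaBoost-style reweighting with weighted resampling, which the abstract does not specify. The names `stub_weak_learner`, `boost`, and `predict` are hypothetical, and the one-feature threshold rule merely stands in for the step where an LLM would summarize text descriptions of the sampled rows.

```python
import math
import random
from typing import Callable, List, Optional, Sequence, Tuple

Example = Tuple[Sequence[float], int]          # (features, label in {-1, +1})
Classifier = Callable[[Sequence[float]], int]  # maps features to -1 or +1


def stub_weak_learner(sample: List[Example]) -> Classifier:
    """Hypothetical stand-in for the LLM step: fit a one-feature threshold rule.

    In the setting described above, this function would instead turn `sample`
    into text descriptions, ask an LLM for a summary, and classify with it.
    """
    best, best_err = None, float("inf")
    for j in range(len(sample[0][0])):
        for x, _ in sample:
            for sign in (1, -1):
                thr = x[j]
                err = sum(
                    1 for xi, yi in sample
                    if (sign if xi[j] >= thr else -sign) != yi
                )
                if err < best_err:
                    best_err, best = err, (j, thr, sign)
    j, thr, sign = best
    return lambda x: sign if x[j] >= thr else -sign


def boost(
    data: List[Example],
    weak_learner: Callable[[List[Example]], Classifier],
    rounds: int = 10,
    sample_size: Optional[int] = None,
    seed: int = 0,
) -> List[Tuple[float, Classifier]]:
    """AdaBoost-style loop (an assumption, not the paper's exact algorithm)."""
    rng = random.Random(seed)
    n = len(data)
    k = sample_size or n
    weights = [1.0 / n] * n
    ensemble: List[Tuple[float, Classifier]] = []
    for _ in range(rounds):
        # Draw the sample according to the current distribution of interest.
        sample = rng.choices(data, weights=weights, k=k)
        h = weak_learner(sample)
        preds = [h(x) for x, _ in data]
        # Weighted error of this round's classifier on the full training set.
        err = sum(w for w, p, (_, y) in zip(weights, preds, data) if p != y)
        if err >= 0.5:
            continue  # not better than random under these weights; skip it
        err = max(err, 1e-10)  # avoid division by zero for a perfect round
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, h))
        # Increase the weight of misclassified points, then renormalize.
        weights = [
            w * math.exp(-alpha * y * p)
            for w, p, (_, y) in zip(weights, preds, data)
        ]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble


def predict(ensemble: List[Tuple[float, Classifier]], x: Sequence[float]) -> int:
    """Weighted vote over the classifiers collected by `boost`."""
    score = sum(alpha * h(x) for alpha, h in ensemble)
    return 1 if score >= 0 else -1
```

Under these assumptions, swapping `stub_weak_learner` for a function that prompts an LLM would leave the loop unchanged: the loop only requires that each round's classifier beat random guessing under the current weights, which is the weak-learner condition the abstract refers to.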
