TART: A plug-and-play Transformer module for task-agnostic reasoning

Abstract

Large language models (LLMs) exhibit in-context learning abilities that enable the same model to perform several tasks without any task-specific training. In contrast, traditional adaptation approaches, such as fine-tuning, modify the underlying model for each specific task. In-context learning, however, consistently underperforms task-specific tuning approaches, even when presented with the same examples. While most existing approaches (e.g., prompt engineering) focus on the LLM's learned representations to close this performance gap, our analysis reveals that LLM representations contain sufficient information to make good predictions. As such, we focus on the LLM's reasoning abilities and demonstrate that this performance gap exists because LLMs are unable to perform simple probabilistic reasoning tasks. This raises an intriguing question: Are LLMs actually capable of learning how to reason in a task-agnostic manner? We answer this in the affirmative and propose TART, which generically improves an LLM's reasoning abilities using a synthetically trained Transformer-based reasoning module. TART trains this reasoning module in a task-agnostic manner using only synthetic logistic regression tasks and composes it with an arbitrary real-world pre-trained model without any additional training. With a single inference module, TART improves performance across different model families (GPT-Neo, Pythia, BLOOM), model sizes (100M to 6B), tasks (14 NLP binary classification tasks), and even across different modalities (audio and vision). Additionally, on the RAFT Benchmark, TART improves the performance of GPT-Neo (125M) such that it outperforms BLOOM (176B) and is within 4% of GPT-3 (175B). Our code and models are available at https://github.com/HazyResearch/TART.
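
To make the phrase "synthetic logistic regression tasks" concrete, the following is a minimal sketch of how one such task could be sampled. It is an illustration under stated assumptions, not the authors' training code: the helper name `sample_logistic_task`, the standard normal distributions for weights and inputs, and the `scale` parameter are all hypothetical choices made here for clarity.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def sample_logistic_task(num_examples, dim, scale=1.0, seed=None):
    """Sample one synthetic binary classification task (hypothetical helper).

    Assumptions made for this sketch, not taken from the abstract:
    - a hidden weight vector w is drawn from a standard normal,
    - each input x is drawn from a standard normal,
    - each label y is drawn from Bernoulli(sigmoid(scale * <w, x>)).
    """
    rng = random.Random(seed)
    w = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    examples = []
    for _ in range(num_examples):
        x = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        logit = scale * sum(wi * xi for wi, xi in zip(w, x))
        y = 1 if rng.random() < sigmoid(logit) else 0
        examples.append((x, y))
    return w, examples


if __name__ == "__main__":
    # Each task is a sequence of (input, label) pairs sharing one hidden w.
    w, examples = sample_logistic_task(num_examples=8, dim=4, seed=0)
    for x, y in examples:
        print([round(v, 2) for v in x], y)
```

A reasoning module trained on many such sequences would see the labeled pairs as context and be asked to predict the label of a held-out input. Because every task uses a freshly sampled weight vector, nothing task-specific can be memorized, which is the sense in which training of this kind is task-agnostic.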
