Synthetic Data (Almost) from Scratch: Generalized Instruction Tuning for Language Models

Abstract

We introduce Generalized Instruction Tuning (called GLAN), a general and scalable method for instruction tuning of Large Language Models (LLMs). Unlike prior work that relies on seed examples or existing datasets to construct instruction tuning data, GLAN takes only a pre-curated taxonomy of human knowledge and capabilities as input and generates large-scale synthetic instruction data across all disciplines. Specifically, inspired by the systematic structure of the human education system, we build the taxonomy semi-automatically, with the help of LLMs, by decomposing human knowledge and capabilities into fields, sub-fields, and ultimately distinct disciplines. We then generate a comprehensive list of subjects for every discipline and design a syllabus tailored to each subject, again using LLMs. With the fine-grained key concepts detailed in every class session of the syllabus, we are able to generate diverse instructions with broad coverage across the entire spectrum of human knowledge and skills. Extensive experiments on large language models (e.g., Mistral) demonstrate that GLAN excels in multiple dimensions, from mathematical reasoning, coding, academic exams, and logical reasoning to general instruction following, without using task-specific training data for these tasks. In addition, GLAN allows for easy customization: new fields or skills can be added by simply incorporating a new node into the taxonomy.
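The top-down generation described in the abstract (discipline, then subjects, then syllabus sessions, then key concepts, then instructions) can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: the `LLM` callable type, the prompt wording, `build_instructions`, and the deterministic `stub_llm` that stands in for a real model call. None of it is taken from the GLAN implementation, and the semi-automatic construction of the taxonomy itself is not modeled; the sketch starts from a given list of disciplines.

```python
from typing import Callable, List

# Hypothetical interface: any function that maps a prompt to a list of short strings.
LLM = Callable[[str], List[str]]


def build_instructions(disciplines: List[str], llm: LLM) -> List[str]:
    """Walk discipline -> subject -> class session -> key concept -> instruction."""
    instructions: List[str] = []
    for discipline in disciplines:
        # Step 1: list the subjects taught within a discipline.
        for subject in llm(f"List subjects taught in {discipline}"):
            # Step 2: design a syllabus, modeled here as a list of class sessions.
            for session in llm(f"List class sessions for a syllabus on {subject}"):
                # Step 3: extract the key concepts of each class session.
                for concept in llm(f"List key concepts for the session {session}"):
                    # Step 4: generate instructions grounded in one key concept.
                    instructions.extend(
                        llm(f"Write an instruction that tests {concept}")
                    )
    return instructions


def stub_llm(prompt: str) -> List[str]:
    """Deterministic stand-in for a model call (assumption): two items per prompt."""
    return [f"{prompt} #1", f"{prompt} #2"]


if __name__ == "__main__":
    data = build_instructions(["Mathematics"], stub_llm)
    print(len(data))
```

Because every level fans out, the number of instructions grows multiplicatively with depth; with the two-item stub, one discipline yields 2 × 2 × 2 × 2 = 16 instructions. In this sketch, adding a new field or skill amounts to appending one more entry to `disciplines`, which mirrors the customization claim above.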
