
CLASS-IT: Conversational and Lecture-Aligned Small-Scale Instruction Tuning for BabyLMs

October 29, 2025
Authors: Luca Capone, Alessandro Bondielli, Alessandro Lenci
cs.AI

Abstract

This work investigates whether small-scale LMs can benefit from instruction tuning. We compare conversational and question-answering instruction tuning datasets, applied either in a merged or sequential curriculum, using decoder-only models with 100M and 140M parameters. Evaluation spans both fine-tuning (SuperGLUE) and zero-shot (BLiMP, EWoK, WUGs, entity tracking, and psycholinguistic correlation) settings. Results show that instruction tuning yields small but consistent gains in fine-tuning scenarios, with sequential curricula outperforming merged data; however, improvements do not consistently transfer to zero-shot tasks, suggesting a trade-off between interaction-focused adaptation and broad linguistic generalization. These results highlight both the potential and the constraints of adapting human-inspired learning strategies to low-resource LMs, and point toward hybrid, curriculum-based approaches for enhancing generalization under ecological training limits.
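The abstract contrasts a merged curriculum (both instruction datasets mixed into a single tuning phase) with a sequential curriculum (one dataset followed by the other). The sketch below is a minimal, hypothetical illustration of that distinction; the toy examples, dataset names, and the train() stub are assumptions for exposition, not details taken from the paper.

```python
import random

# Toy instruction-tuning examples standing in for the two dataset types
# discussed in the abstract (conversational vs. question answering).
conversational = [
    {"prompt": "Hi! How was your day?", "response": "Pretty good, thanks for asking."},
]
qa = [
    {"prompt": "What is the capital of France?", "response": "Paris."},
]

def train(phase_name, examples):
    """Stand-in for one causal-LM fine-tuning phase over the given examples."""
    print(f"{phase_name}: fine-tuning on {len(examples)} examples")

# Merged curriculum: mix both sources and run a single tuning phase.
merged = conversational + qa
random.shuffle(merged)
train("merged", merged)

# Sequential curriculum: one phase per source, run back to back.
train("sequential/conversational", conversational)
train("sequential/qa", qa)
```

Under this framing, the reported result is that the sequential ordering transfers better to fine-tuned downstream tasks than the merged mixture, while neither reliably improves zero-shot linguistic benchmarks.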