
CLASS-IT: Conversational and Lecture-Aligned Small-Scale Instruction Tuning for BabyLMs

October 29, 2025
Authors: Luca Capone, Alessandro Bondielli, Alessandro Lenci
cs.AI

Abstract

This work investigates whether small-scale LMs can benefit from instruction tuning. We compare conversational and question-answering instruction tuning datasets, applied either in a merged or sequential curriculum, using decoder-only models with 100M and 140M parameters. Evaluation spans both fine-tuning (SuperGLUE) and zero-shot (BLiMP, EWoK, WUGs, entity tracking, and psycholinguistic correlation) settings. Results show that instruction tuning yields small but consistent gains in fine-tuning scenarios, with sequential curricula outperforming merged data; however, improvements do not consistently transfer to zero-shot tasks, suggesting a trade-off between interaction-focused adaptation and broad linguistic generalization. These results highlight both the potential and the constraints of adapting human-inspired learning strategies to low-resource LMs, and point toward hybrid, curriculum-based approaches for enhancing generalization under ecological training limits.
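To make the two training regimes concrete, the sketch below contrasts the merged and sequential curricula described in the abstract: in the merged setting the conversational and question-answering instruction data are interleaved in a single shuffled pass, while in the sequential setting they form consecutive fine-tuning phases. This is a minimal illustration under assumed data handling; the function and variable names are hypothetical and do not come from the authors' code.

```python
import random

# Hypothetical sketch of the two instruction-tuning curricula compared in the paper.
# `conversational` and `qa` stand for the two instruction datasets (lists of examples);
# their contents and formats are assumptions, not taken from the released resources.

def merged_curriculum(conversational, qa, seed=42):
    """Merged setting: combine both instruction datasets and shuffle them,
    so conversational and QA examples are interleaved in one training phase."""
    merged = list(conversational) + list(qa)
    random.Random(seed).shuffle(merged)
    return [merged]  # a single training phase

def sequential_curriculum(conversational, qa):
    """Sequential setting: present one dataset first, then the other,
    as two consecutive fine-tuning phases."""
    return [list(conversational), list(qa)]

# Usage: each returned phase would drive a separate fine-tuning stage of the
# 100M/140M-parameter decoder-only models evaluated in the paper.
```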