
Tracking Universal Features Through Fine-Tuning and Model Merging

October 16, 2024
Authors: Niels Horn, Desmond Elliott
cs.AI

Abstract

We study how features emerge, disappear, and persist across models fine-tuned on different domains of text. More specifically, we start from a base one-layer Transformer language model trained on a combination of the BabyLM corpus and a collection of Python code from The Stack. This base model is adapted to two new domains of text, TinyStories and the Lua programming language, and the two resulting models are then merged using spherical linear interpolation. Our exploration aims to provide deeper insights into the stability and transformation of features across typical transfer-learning scenarios, using small-scale models and sparse auto-encoders.
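The merging step relies on spherical linear interpolation (SLERP), which interpolates along the great-circle arc between two weight vectors rather than along the straight line between them. As a rough illustration of the technique, here is a minimal NumPy sketch; the function name, the parameter-wise application, and the 50/50 interpolation factor are illustrative assumptions, not details from the paper.

```python
import numpy as np

def slerp(w_a: np.ndarray, w_b: np.ndarray, t: float, eps: float = 1e-8) -> np.ndarray:
    """Spherical linear interpolation between two flattened weight vectors."""
    # Angle between the two weight vectors.
    cos_theta = np.dot(w_a, w_b) / (np.linalg.norm(w_a) * np.linalg.norm(w_b))
    theta = np.arccos(np.clip(cos_theta, -1.0, 1.0))

    if np.abs(np.sin(theta)) < eps:
        # Nearly (anti-)parallel vectors: fall back to linear interpolation.
        return (1.0 - t) * w_a + t * w_b

    # Standard SLERP weights: follow the arc between the two points.
    return (np.sin((1.0 - t) * theta) * w_a + np.sin(t * theta) * w_b) / np.sin(theta)

# Hypothetical usage: merge two fine-tuned checkpoints parameter by parameter.
# merged = {name: slerp(tiny_stories[name].ravel(), lua[name].ravel(), t=0.5)
#                 .reshape(tiny_stories[name].shape)
#           for name in tiny_stories}
```

Compared with plain linear averaging, SLERP preserves the norm of the interpolated weights, which is one common motivation for using it in model merging.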

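The features themselves are read out with sparse auto-encoders trained on model activations. Below is a minimal sketch of the standard single-layer SAE recipe (ReLU encoder, linear decoder, L1 sparsity penalty); the dimensions, initialization, and penalty coefficient are illustrative assumptions, not the paper's configuration.

```python
import numpy as np

rng = np.random.default_rng(0)
d_model, d_hidden = 64, 512  # illustrative sizes, not the paper's

# Encoder/decoder parameters of a single-layer sparse auto-encoder.
W_enc = rng.normal(scale=0.02, size=(d_model, d_hidden))
b_enc = np.zeros(d_hidden)
W_dec = rng.normal(scale=0.02, size=(d_hidden, d_model))
b_dec = np.zeros(d_model)

def encode(x: np.ndarray) -> np.ndarray:
    # ReLU feature activations; the L1 penalty drives most of them to zero.
    return np.maximum(x @ W_enc + b_enc, 0.0)

def decode(f: np.ndarray) -> np.ndarray:
    # Linear reconstruction of the activations from the sparse features.
    return f @ W_dec + b_dec

def loss(x: np.ndarray, l1_coeff: float = 1e-3) -> float:
    f = encode(x)
    # Reconstruction error plus an L1 sparsity penalty on the features.
    return float(np.mean((decode(f) - x) ** 2) + l1_coeff * np.mean(np.abs(f)))
```

Tracking which SAE features stay active before fine-tuning, after fine-tuning, and after merging is what lets the study distinguish universal features from domain-specific ones.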