
Neural Thickets: Diverse Task Experts Are Dense Around Pretrained Weights

March 12, 2026
作者: Yulu Gan, Phillip Isola
cs.AI

Abstract

Pretraining produces a learned parameter vector that is typically treated as a starting point for further iterative adaptation. In this work, we instead view the outcome of pretraining as a distribution over parameter vectors, whose support already contains task-specific experts. We show that in small models such expert solutions occupy a negligible fraction of the volume of this distribution, making their discovery reliant on structured optimization methods such as gradient descent. In contrast, in large, well-pretrained models the density of task experts increases dramatically, so that diverse, task-improving specialists populate a substantial fraction of the neighborhood around the pretrained weights. Motivated by this perspective, we explore a simple, fully parallel post-training method that samples N parameter perturbations at random, selects the top K, and ensembles predictions via majority vote. Despite its simplicity, this approach is competitive with standard post-training methods such as PPO, GRPO, and ES for contemporary large-scale models.
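The sample/select/vote procedure described above can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: the function and parameter names (`evaluate`, `predict`, `sigma`) are placeholders, the perturbations are isotropic Gaussian for simplicity, and the prediction task is assumed to be classification with integer labels.

```python
import numpy as np

def sample_select_vote(base_weights, evaluate, predict, X,
                       N=64, K=8, sigma=0.01, seed=0):
    """Illustrative sketch of fully parallel post-training:
    sample N random perturbations of the pretrained weights,
    keep the K with the best task score, and ensemble their
    predictions on inputs X by majority vote.
    `evaluate` and `predict` are user-supplied placeholders."""
    rng = np.random.default_rng(seed)
    # Sample N candidate "experts" in the neighborhood of the pretrained weights.
    candidates = [base_weights + sigma * rng.standard_normal(base_weights.shape)
                  for _ in range(N)]
    # Score each candidate on the task (higher is better) and keep the top K.
    scores = np.array([evaluate(w) for w in candidates])
    top_k = [candidates[i] for i in np.argsort(scores)[-K:]]
    # Ensemble: majority vote over the K experts' integer class predictions.
    preds = np.stack([predict(w, X) for w in top_k])  # shape (K, n_examples)
    return np.apply_along_axis(lambda col: np.bincount(col).argmax(), 0, preds)
```

Note that, unlike PPO, GRPO, or ES, every candidate is evaluated independently, so the N evaluations can run in parallel with no gradient computation or iterative update.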