How Do Training Methods Influence the Utilization of Vision Models?
October 18, 2024
Authors: Paul Gavrikov, Shashank Agnihotri, Margret Keuper, Janis Keuper
cs.AI
Abstract
Not all learnable parameters (e.g., weights) contribute equally to a neural
network's decision function. In fact, entire layers' parameters can sometimes
be reset to random values with little to no impact on the model's decisions. We
revisit earlier studies that examined how architecture and task complexity
influence this phenomenon and ask: is this phenomenon also affected by how we
train the model? We conduct experimental evaluations on a diverse set of
ImageNet-1k classification models to explore this, keeping the architecture and
training data constant but varying the training pipeline. Our findings reveal
that the training method strongly influences which layers become critical to
the decision function for a given task. For example, improved training regimes
and self-supervised training increase the importance of early layers while
significantly under-utilizing deeper layers. In contrast, methods such as
adversarial training display an opposite trend. Our preliminary results extend
previous findings, offering a more nuanced understanding of the inner mechanics
of neural networks.
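The probe the abstract describes — resetting an entire layer's parameters to random values and measuring how much the model's decisions change — can be sketched in a few lines. The snippet below is a minimal illustration on a toy NumPy MLP, not the paper's actual pipeline (which evaluates ImageNet-1k classifiers); all function names and the output-shift criticality measure are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy 3-layer ReLU MLP; its weights stand in for a trained model's layers.
def init_layers(n_layers=3, dim=8):
    return [rng.normal(0, 0.5, (dim, dim)) for _ in range(n_layers)]

def forward(x, layers):
    for W in layers:
        x = np.maximum(0.0, x @ W)  # linear layer followed by ReLU
    return x

def criticality(layers, x, idx):
    """How much do the model's outputs move when layer `idx` is reset
    to a fresh random draw of the same shape? A near-zero score means
    the layer is under-utilized; a large score means it is critical."""
    probed = list(layers)
    probed[idx] = rng.normal(0, 0.5, layers[idx].shape)  # re-randomize one layer
    base, perturbed = forward(x, layers), forward(x, probed)
    return float(np.mean(np.abs(base - perturbed)))

x = rng.normal(size=(16, 8))          # a small batch of inputs
layers = init_layers()
scores = [criticality(layers, x, i) for i in range(len(layers))]
```

Comparing `scores` across layers (and across models trained with different pipelines) is the kind of per-layer criticality profile the paper studies; the released code at the repository above implements the real evaluation.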
Code: https://github.com/paulgavrikov/layer_criticality