How Do Training Methods Influence the Utilization of Vision Models?
October 18, 2024
Authors: Paul Gavrikov, Shashank Agnihotri, Margret Keuper, Janis Keuper
cs.AI
Abstract
Not all learnable parameters (e.g., weights) contribute equally to a neural
network's decision function. In fact, entire layers' parameters can sometimes
be reset to random values with little to no impact on the model's decisions. We
revisit earlier studies that examined how architecture and task complexity
influence this phenomenon and ask: is this phenomenon also affected by how we
train the model? We conduct experimental evaluations on a diverse set of
ImageNet-1k classification models to explore this, keeping the architecture and
training data constant but varying the training pipeline. Our findings reveal
that the training method strongly influences which layers become critical to
the decision function for a given task. For example, improved training regimes
and self-supervised training increase the importance of early layers while
significantly under-utilizing deeper layers. In contrast, methods such as
adversarial training display an opposite trend. Our preliminary results extend
previous findings, offering a more nuanced understanding of the inner mechanics
of neural networks.
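The probe the abstract describes — resetting an entire layer's parameters to random values and measuring how much the model's decisions change — can be sketched in a few lines. The snippet below is a minimal illustration on a toy NumPy MLP, not the paper's actual pipeline (which evaluates ImageNet-1k classifiers); all function names and the output-shift criticality measure are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy 3-layer ReLU MLP; its weights stand in for a trained model's layers.
def init_layers(n_layers=3, dim=8):
    return [rng.normal(0, 0.5, (dim, dim)) for _ in range(n_layers)]

def forward(x, layers):
    for W in layers:
        x = np.maximum(0.0, x @ W)  # linear layer followed by ReLU
    return x

def criticality(layers, x, idx):
    """How much do the model's outputs move when layer `idx` is reset
    to a fresh random draw of the same shape? A near-zero score means
    the layer is under-utilized; a large score means it is critical."""
    probed = list(layers)
    probed[idx] = rng.normal(0, 0.5, layers[idx].shape)  # re-randomize one layer
    base, perturbed = forward(x, layers), forward(x, probed)
    return float(np.mean(np.abs(base - perturbed)))

x = rng.normal(size=(16, 8))          # a small batch of inputs
layers = init_layers()
scores = [criticality(layers, x, i) for i in range(len(layers))]
```

Comparing `scores` across layers (and across models trained with different pipelines) is the kind of per-layer criticality profile the paper studies; the released code at the repository above implements the real evaluation.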
Code: https://github.com/paulgavrikov/layer_criticality