Understanding Visual Feature Reliance through the Lens of Complexity
July 8, 2024
Authors: Thomas Fel, Louis Bethune, Andrew Kyle Lampinen, Thomas Serre, Katherine Hermann
cs.AI
Abstract
Recent studies suggest that deep learning models' inductive bias towards
favoring simpler features may be one of the sources of shortcut learning. Yet,
there has been limited focus on understanding the complexity of the myriad
features that models learn. In this work, we introduce a new metric for
quantifying feature complexity, based on V-information and
capturing whether a feature requires complex computational transformations to
be extracted. Using this V-information metric, we analyze the
complexities of 10,000 features, represented as directions in the penultimate
layer, that were extracted from a standard ImageNet-trained vision model. Our
study addresses four key questions: First, we ask what features look like as a
function of complexity and find a spectrum of simple to complex features
present within the model. Second, we ask when features are learned during
training. We find that simpler features dominate early in training, and more
complex features emerge gradually. Third, we investigate where within the
network simple and complex features flow, and find that simpler features tend
to bypass the visual hierarchy via residual connections. Fourth, we explore the
connection between feature complexity and feature importance in driving the
network's decisions. We find that complex features tend to be less important.
Surprisingly, important features become accessible at earlier layers during
training, like a sedimentation process, allowing the model to build upon these
foundational elements.
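
As a rough illustration of what "a feature represented as a direction in the penultimate layer" and "complexity as difficulty of extraction" could mean in practice, here is a minimal Python sketch. It is not the paper's metric: the abstract does not give the V-information formula, so the sketch substitutes a hypothetical proxy (one minus the average R^2 of a linear probe across layers). The names `linear_decodability` and `complexity_proxy`, the two-layer toy "network", and the synthetic data are all assumptions made for the example.

```python
import numpy as np

def linear_decodability(acts, z):
    """In-sample R^2 of a least-squares linear probe (with bias) predicting z from acts."""
    X = np.hstack([acts, np.ones((len(acts), 1))])
    coef, *_ = np.linalg.lstsq(X, z, rcond=None)
    residual = z - X @ coef
    return 1.0 - (residual @ residual) / np.sum((z - z.mean()) ** 2)

def complexity_proxy(layer_acts, z):
    """Hypothetical proxy: 1 - mean linear decodability of z across layers.

    A feature that is linearly readable from early layers scores near 0;
    one that only becomes readable late scores higher.
    """
    return 1.0 - float(np.mean([linear_decodability(a, z) for a in layer_acts]))

# Toy stand-in for a network: the input layer, then a "penultimate" layer that
# passes the input through unchanged (residual-like) and adds one nonlinear unit.
rng = np.random.default_rng(0)
x = rng.standard_normal((2000, 2))
layer0 = x
penultimate = np.column_stack([x[:, 0], x[:, 1], x[:, 0] * x[:, 1]])
layers = [layer0, penultimate]

# A feature is a direction in the penultimate layer; its value is the projection.
simple_dir = np.array([1.0, 0.0, 0.0])   # reads the passed-through input
complex_dir = np.array([0.0, 0.0, 1.0])  # reads the nonlinear product unit

simple_score = complexity_proxy(layers, penultimate @ simple_dir)
complex_score = complexity_proxy(layers, penultimate @ complex_dir)
print(f"simple feature:  {simple_score:.3f}")
print(f"complex feature: {complex_score:.3f}")
```

In this toy setting the passed-through feature is already linearly readable at the input, so its proxy score is close to zero, while the product feature only becomes readable at the last layer and scores higher. This loosely echoes the abstract's observation that simpler features can bypass the hierarchy via residual connections, under the stated assumptions only.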