Beyond Linear Bottlenecks: Spline-Based Knowledge Distillation for Culturally Diverse Art Style Classification

Abstract

Art style classification remains a formidable challenge in computational aesthetics because expertly labeled datasets are scarce and stylistic elements interact in intricate, often nonlinear ways. Recent dual-teacher self-supervised frameworks reduce reliance on labeled data, but their linear projection layers and localized focus struggle to model global compositional context and complex style-feature interactions. To address these limitations, we enhance the dual-teacher knowledge distillation framework by replacing the conventional MLP projection and prediction heads with Kolmogorov-Arnold Networks (KANs). Our approach retains complementary guidance from two teacher networks, one emphasizing localized texture and brushstroke patterns and the other capturing broader stylistic hierarchies, while leveraging KANs' spline-based activations to model nonlinear feature correlations with mathematical precision. Experiments on WikiArt and Pandora18k show that our approach outperforms the base dual-teacher architecture in Top-1 accuracy. Our findings highlight the importance of KANs in disentangling complex style manifolds, yielding better linear-probe accuracy than MLP projections.
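As a rough illustration of what a spline-based head might look like, the sketch below stacks two KAN-style layers in which every input-to-output edge carries its own learnable one-dimensional spline. It is not the authors' implementation. The names `hat_basis`, `KANLayer`, and `KANProjectionHead`, the grid range, the knot count, and the dimensions are all hypothetical. For brevity it uses piecewise-linear (degree-1) splines rather than the higher-order B-splines common in KAN implementations, and it omits the residual base activation, the teacher networks, and the training loop.

```python
import random


def hat_basis(x, grid):
    """Evaluate degree-1 B-spline (hat) basis functions on a uniform grid.

    Returns one value per knot. The input is clamped to the grid range, so
    the values are non-negative and sum to 1 (up to rounding).
    """
    x = min(max(x, grid[0]), grid[-1])
    step = grid[1] - grid[0]
    return [max(0.0, 1.0 - abs(x - knot) / step) for knot in grid]


class KANLayer:
    """One KAN-style layer: out_j = sum_i phi_ij(x_i), each phi_ij a 1-D spline."""

    def __init__(self, in_dim, out_dim, num_knots=8, x_min=-2.0, x_max=2.0, seed=0):
        rng = random.Random(seed)
        step = (x_max - x_min) / (num_knots - 1)
        self.grid = [x_min + k * step for k in range(num_knots)]
        # coef[j][i][k]: spline coefficient at knot k for the edge input i -> output j.
        # Random initial values stand in for parameters that training would learn.
        self.coef = [[[rng.gauss(0.0, 0.1) for _ in range(num_knots)]
                      for _ in range(in_dim)]
                     for _ in range(out_dim)]

    def forward(self, x):
        # Evaluate the shared basis once per input coordinate.
        bases = [hat_basis(xi, self.grid) for xi in x]
        return [sum(sum(c * b for c, b in zip(edge_coef, bases[i]))
                    for i, edge_coef in enumerate(row))
                for row in self.coef]


class KANProjectionHead:
    """Two stacked KAN layers, used where an MLP projection head would sit."""

    def __init__(self, in_dim, hidden_dim, out_dim, seed=0):
        self.layers = [KANLayer(in_dim, hidden_dim, seed=seed),
                       KANLayer(hidden_dim, out_dim, seed=seed + 1)]

    def forward(self, x):
        for layer in self.layers:
            x = layer.forward(x)
        return x


# Project a toy 4-dimensional backbone feature to a 3-dimensional embedding.
head = KANProjectionHead(in_dim=4, hidden_dim=6, out_dim=3)
embedding = head.forward([0.3, -1.2, 0.8, 1.9])
print(len(embedding))
```

The point of the sketch is the contrast with an MLP head: an MLP applies a fixed nonlinearity after a learned linear map, whereas here the nonlinearity on each edge is itself the learned object, parameterized by spline coefficients.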
