Beyond Linear Bottlenecks: Spline-Based Knowledge Distillation for Culturally Diverse Art Style Classification
July 31, 2025
Authors: Abdellah Zakaria Sellam, Salah Eddine Bekhouche, Cosimo Distante, Abdelmalik Taleb-Ahmed
cs.AI
Abstract
Art style classification remains a formidable challenge in computational
aesthetics due to the scarcity of expertly labeled datasets and the intricate,
often nonlinear interplay of stylistic elements. While recent dual-teacher
self-supervised frameworks reduce reliance on labeled data, their linear
projection layers and localized focus struggle to model global compositional
context and complex style-feature interactions. We enhance the dual-teacher
knowledge distillation framework to address these limitations by replacing
conventional MLP projection and prediction heads with Kolmogorov-Arnold
Networks (KANs). Our approach retains complementary guidance from two teacher
networks: one emphasizes localized texture and brushstroke patterns, while the
other captures broader stylistic hierarchies. KANs' spline-based activations,
in turn, model nonlinear feature correlations with mathematical precision.
Experiments on WikiArt and Pandora18k demonstrate that our approach outperforms
the base dual-teacher architecture in Top-1 accuracy. Our findings highlight
the importance of KANs in disentangling complex style manifolds, leading to
better linear-probe accuracy than MLP projections.
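
To make the idea of a spline-based head concrete, the following is a minimal sketch of a KAN-style layer in plain Python. It is not the authors' implementation. It assumes piecewise-linear (order-1 B-spline) basis functions on a uniform grid instead of the higher-order B-splines usually used in KANs, it omits training, and the names `hat_basis`, `KANLayer`, `num_knots`, `x_min`, and `x_max` are hypothetical. The point it illustrates is that each input-output edge carries its own learnable one-dimensional function, and each output is the sum of those functions over the inputs.

```python
import random


def hat_basis(x, grid):
    """Evaluate piecewise-linear (order-1 B-spline) basis functions at x.

    `grid` is a sorted list of uniformly spaced knots; x is clamped to its
    range. Returns one non-negative weight per knot; the weights sum to 1.
    """
    step = grid[1] - grid[0]
    x = min(max(x, grid[0]), grid[-1])
    return [max(0.0, 1.0 - abs(x - knot) / step) for knot in grid]


class KANLayer:
    """Toy KAN-style layer: y_j = sum_i phi_ij(x_i), each phi_ij a spline."""

    def __init__(self, in_dim, out_dim, num_knots=9,
                 x_min=-2.0, x_max=2.0, seed=0):
        rng = random.Random(seed)
        step = (x_max - x_min) / (num_knots - 1)
        self.grid = [x_min + k * step for k in range(num_knots)]
        # coef[j][i][k] is the value of phi_ij at knot k. In a real model
        # these would be the learnable parameters of the head.
        self.coef = [[[rng.gauss(0.0, 0.1) for _ in range(num_knots)]
                      for _ in range(in_dim)]
                     for _ in range(out_dim)]

    def forward(self, x):
        # Basis weights depend only on the input, so compute them once.
        basis = [hat_basis(xi, self.grid) for xi in x]
        return [sum(c * b
                    for coef_i, basis_i in zip(coef_j, basis)
                    for c, b in zip(coef_i, basis_i))
                for coef_j in self.coef]


if __name__ == "__main__":
    layer = KANLayer(in_dim=4, out_dim=2)
    print(layer.forward([0.3, -1.2, 0.0, 1.7]))
```

In contrast to an MLP head, which applies one fixed nonlinearity after a linear map, the nonlinearity here sits on every edge and is itself parameterized. Setting the knot values of a single edge to `knot ** 2`, for example, makes that edge interpolate a parabola between the knots.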