Beyond Linear Bottlenecks: Spline-Based Knowledge Distillation for Culturally Diverse Art Style Classification

July 31, 2025
Authors: Abdellah Zakaria Sellam, Salah Eddine Bekhouche, Cosimo Distante, Abdelmalik Taleb-Ahmed
cs.AI

Abstract

Art style classification remains a formidable challenge in computational aesthetics due to the scarcity of expertly labeled datasets and the intricate, often nonlinear interplay of stylistic elements. While recent dual-teacher self-supervised frameworks reduce reliance on labeled data, their linear projection layers and localized focus struggle to model global compositional context and complex style-feature interactions. We address these limitations by replacing the conventional MLP projection and prediction heads of the dual-teacher knowledge distillation framework with Kolmogorov-Arnold Networks (KANs). Our approach retains complementary guidance from two teacher networks, one emphasizing localized texture and brushstroke patterns and the other capturing broader stylistic hierarchies, while leveraging KANs' spline-based activations to model nonlinear feature correlations with mathematical precision. Experiments on WikiArt and Pandora18k demonstrate that our approach outperforms the base dual-teacher architecture in Top-1 accuracy. Our findings highlight the importance of KANs in disentangling complex style manifolds, leading to better linear-probe accuracy than MLP projections.
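
To make the idea of a spline-based head concrete, here is a minimal sketch of a KAN-style layer in plain Python. It is not the authors' implementation: the abstract does not give the spline order, grid size, or head dimensions, so the class name `KANLayer`, the helper `hat_basis`, the uniform grid on [-1, 1], and the use of piecewise-linear (order-1) splines are all assumptions chosen for brevity. Published KAN variants typically use higher-order B-splines and an additional base activation, both omitted here.

```python
import random


def hat_basis(x, grid):
    """Evaluate piecewise-linear (order-1 B-spline) hat functions at x.

    Assumes a uniform grid. x is clamped to [grid[0], grid[-1]], so the
    returned values are non-negative and sum to 1.
    """
    x = min(max(x, grid[0]), grid[-1])
    h = grid[1] - grid[0]  # uniform knot spacing (assumption)
    return [max(0.0, 1.0 - abs(x - g) / h) for g in grid]


class KANLayer:
    """Hypothetical KAN-style layer: one learnable 1-D function per edge.

    Output j is sum_i phi[j][i](x[i]), where each phi is a spline whose
    coefficients are the trainable parameters. An MLP layer would instead
    apply a fixed nonlinearity to a learned linear combination.
    """

    def __init__(self, in_dim, out_dim, num_knots=8,
                 x_min=-1.0, x_max=1.0, seed=0):
        rng = random.Random(seed)
        step = (x_max - x_min) / (num_knots - 1)
        self.grid = [x_min + k * step for k in range(num_knots)]
        # coef[j][i][k]: weight of basis function k on the edge i -> j.
        self.coef = [
            [[rng.gauss(0.0, 0.1) for _ in range(num_knots)]
             for _ in range(in_dim)]
            for _ in range(out_dim)
        ]

    def forward(self, x):
        # Evaluate the basis once per input feature, then reuse it for
        # every output unit.
        basis = [hat_basis(xi, self.grid) for xi in x]
        return [
            sum(c * b
                for edge, feat_basis in zip(row, basis)
                for c, b in zip(edge, feat_basis))
            for row in self.coef
        ]


if __name__ == "__main__":
    # Illustrative sizes only; real projection heads are far wider.
    head = KANLayer(in_dim=4, out_dim=2)
    print(head.forward([0.2, -0.7, 0.5, 0.0]))
```

In a distillation setup, a layer like this would sit where the MLP projection or prediction head normally goes, with the spline coefficients trained by gradient descent in an autodiff framework rather than left at their random initial values as in this sketch.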