Make it SING: Analyzing Semantic Invariants in Classifiers

Abstract

All classifiers, including state-of-the-art vision models, possess invariants that are partially rooted in the geometry of their linear mappings. These invariants, which reside in the null space of the classifier, induce equivalent sets of inputs that map to identical outputs. Their semantic content remains vague, as existing approaches struggle to provide human-interpretable information. To address this gap, we present Semantic Interpretation of the Null-space Geometry (SING), a method that constructs images that are equivalent with respect to the network and assigns semantic interpretations to the available variations. We use a mapping from network features to multimodal vision-language models, which allows us to obtain natural-language descriptions and visual examples of the induced semantic shifts. SING can be applied to a single image to uncover local invariants, or to sets of images, enabling broad statistical analysis at the class and model levels. For example, our method reveals that ResNet50 leaks relevant semantic attributes to the null space, whereas DinoViT, a ViT pretrained with self-supervised DINO, is superior in maintaining class semantics across the invariant space.
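The null-space invariance described in the abstract can be illustrated with a small linear-algebra sketch. This is not the SING pipeline itself. It only shows the underlying fact that perturbing a feature vector along the null space of a linear classification head leaves the logits unchanged. The head `W`, the feature vector `f`, their sizes, and the helper `null_space_projector` are hypothetical and chosen only for illustration.

```python
import numpy as np


def null_space_projector(W: np.ndarray, tol: float = 1e-10) -> np.ndarray:
    """Orthogonal projector onto the null space of W (shape: classes x features)."""
    # Right singular vectors whose singular values are (numerically) zero,
    # plus those beyond the number of rows, span the null space of W.
    _, s, vt = np.linalg.svd(W, full_matrices=True)
    rank = int(np.sum(s > tol * s.max()))
    null_basis = vt[rank:].T  # shape: features x (features - rank)
    return null_basis @ null_basis.T


rng = np.random.default_rng(0)
W = rng.standard_normal((10, 64))  # hypothetical linear head: 10 classes, 64 features
f = rng.standard_normal(64)        # hypothetical feature vector of one image

P_null = null_space_projector(W)

# Restrict an arbitrary perturbation to the null space of W.
delta = P_null @ rng.standard_normal(64)

logits_before = W @ f
logits_after = W @ (f + delta)

# The features differ, but the classifier output is the same up to round-off.
print(np.linalg.norm(delta) > 0, np.allclose(logits_before, logits_after))
```

In this toy setting, `f` and `f + delta` form an equivalent pair in the sense of the abstract: the classifier cannot distinguish them, even though the features differ. Per the abstract, SING assigns semantic interpretations to such variations. The sketch makes no attempt to do that.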
