TensorLens: End-to-End Transformer Analysis via High-Order Attention Tensors

Abstract

Attention matrices are fundamental to transformer research, supporting a broad range of applications including interpretability, visualization, manipulation, and distillation. Yet most existing analyses focus on individual attention heads or layers and fail to account for the model's global behavior. While prior efforts have extended attention formulations across multiple heads via averaging and matrix multiplications, or incorporated components such as normalization and FFNs, a unified and complete representation that encapsulates all transformer blocks is still lacking. We address this gap by introducing TensorLens, a novel formulation that captures the entire transformer as a single, input-dependent linear operator expressed through a high-order attention-interaction tensor. This tensor jointly encodes attention, FFNs, activations, normalizations, and residual connections, offering a theoretically coherent and expressive linear representation of the model's computation. TensorLens is theoretically grounded, and our empirical validation shows that it yields richer representations than previous attention-aggregation methods. Our experiments demonstrate that the attention tensor can serve as a powerful foundation for developing tools aimed at interpretability and model understanding. Our code is attached as supplementary material.
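To make the phrase "input-dependent linear operator expressed through a high-order tensor" concrete, the following is a minimal NumPy sketch. It is not the TensorLens construction itself: it assumes a toy model with single-head self-attention and residual connections only (no normalization, FFN, or activation), and the names `attention_block` and `block_tensor` are hypothetical. Once the attention map is fixed for a given input, each block acts linearly on that input, so it can be written as a fourth-order tensor, and stacked blocks compose by tensor contraction.

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)  # subtract the max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def attention_block(X, Wq, Wk, Wv, Wo):
    """Toy block (assumption): single-head self-attention plus a residual connection."""
    d = Wq.shape[1]
    A = softmax((X @ Wq) @ (X @ Wk).T / np.sqrt(d))  # (n, n) attention map, depends on X
    Y = X + A @ X @ Wv @ Wo
    return Y, A

def block_tensor(A, Wv, Wo):
    """Fourth-order tensor T with Y[i, d] = sum over j, e of T[i, d, j, e] * X[j, e]."""
    n = A.shape[0]
    M = Wv @ Wo  # (D, D) combined value/output projection, indexed M[e, d]
    D = M.shape[0]
    residual = np.einsum("ij,de->idje", np.eye(n), np.eye(D))  # identity (skip) path
    mixing = np.einsum("ij,ed->idje", A, M)                    # attention path
    return residual + mixing

rng = np.random.default_rng(0)
n, D, d = 4, 6, 3  # tokens, model width, head width (arbitrary toy sizes)
X = rng.normal(size=(n, D))

def random_weights():
    Wq, Wk, Wv = (rng.normal(size=(D, d)) for _ in range(3))
    Wo = rng.normal(size=(d, D))
    return Wq, Wk, Wv, Wo

W1, W2 = random_weights(), random_weights()

# Block 1: the tensor reproduces the block output exactly for this input.
Y1, A1 = attention_block(X, *W1)
T1 = block_tensor(A1, W1[2], W1[3])
max_error = np.abs(Y1 - np.einsum("idje,je->id", T1, X)).max()

# Block 2 sees Y1, so its attention map (and tensor) also depends on the input.
Y2, A2 = attention_block(Y1, *W2)
T2 = block_tensor(A2, W2[2], W2[3])

# Contracting the per-block tensors gives one operator from input to final output.
T_total = np.einsum("idje,jekf->idkf", T2, T1)
Y2_from_tensor = np.einsum("idkf,kf->id", T_total, X)
```

The tensor has shape `(n, D, n, D)`: it records how each output token and channel depends on each input token and channel, which is more information than an `(n, n)` attention matrix carries. The operator is only linear for the specific input that produced `A1` and `A2`; a different input yields a different tensor.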