Orion-MSP: Multi-Scale Sparse Attention for Tabular In-Context Learning

Abstract

Tabular data remain the predominant format for real-world applications. Yet developing effective neural models for such data is challenging because of heterogeneous feature types and complex interactions occurring at multiple scales. Recent advances in tabular in-context learning (ICL), such as TabPFN and TabICL, have achieved state-of-the-art performance comparable to gradient-boosted trees (GBTs) without task-specific fine-tuning. However, current architectures exhibit key limitations: (1) single-scale feature processing that overlooks hierarchical dependencies, (2) dense attention with quadratic scaling in table width, and (3) strictly sequential component processing that prevents iterative representation refinement and cross-component communication. To address these challenges, we introduce Orion-MSP, a tabular ICL architecture featuring three key innovations: (1) multi-scale processing to capture hierarchical feature interactions; (2) block-sparse attention combining windowed, global, and random patterns for scalable efficiency and long-range connectivity; and (3) a Perceiver-style memory enabling safe bidirectional information flow across components. Across diverse benchmarks, Orion-MSP matches or surpasses state-of-the-art performance while scaling effectively to high-dimensional tables, establishing a new standard for efficient tabular in-context learning. The model is publicly available at https://github.com/Lexsi-Labs/Orion-MSP.
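The abstract names the three attention patterns but does not say how they are combined. As a rough illustration only, the sketch below builds a block-level mask in the general style of windowed-plus-global-plus-random sparse attention. It is not the authors' implementation: the function names, the parameters (`window`, `num_global`, `num_random`, `seed`), and the choice to treat the first blocks as global are all assumptions made for this example.

```python
import random


def block_sparse_mask(num_blocks, window=1, num_global=1, num_random=1, seed=0):
    """Hypothetical helper: mask[i][j] is True if query block i may attend to key block j."""
    rng = random.Random(seed)
    mask = [[False] * num_blocks for _ in range(num_blocks)]

    # Windowed pattern: each block attends to neighbours within `window` blocks.
    for i in range(num_blocks):
        for j in range(max(0, i - window), min(num_blocks, i + window + 1)):
            mask[i][j] = True

    # Global pattern (assumption: the first `num_global` blocks are global).
    # Global blocks attend to every block and are attended to by every block.
    for g in range(min(num_global, num_blocks)):
        for i in range(num_blocks):
            mask[g][i] = True
            mask[i][g] = True

    # Random pattern: each block gets up to `num_random` extra key blocks,
    # which provides long-range links beyond the local window.
    for i in range(num_blocks):
        candidates = [j for j in range(num_blocks) if not mask[i][j]]
        for j in rng.sample(candidates, min(num_random, len(candidates))):
            mask[i][j] = True

    return mask


def density(mask):
    """Fraction of block pairs that are allowed to attend."""
    n = len(mask)
    return sum(sum(row) for row in mask) / (n * n)
```

In this sketch every non-global row allows at most `(2 * window + 1) + num_global + num_random` key blocks regardless of `num_blocks`, so the number of allowed block pairs grows roughly linearly with table width rather than quadratically, which is the efficiency argument the abstract makes against dense attention.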

