Graph Mamba: Towards Learning on Graphs with State Space Models
February 13, 2024
Authors: Ali Behrouz, Farnoosh Hashemi
cs.AI
Abstract
Graph Neural Networks (GNNs) have shown promising potential in graph
representation learning. The majority of GNNs define a local message-passing
mechanism, propagating information over the graph by stacking multiple layers.
These methods, however, are known to suffer from two major limitations:
over-squashing and poor capturing of long-range dependencies. Recently, Graph
Transformers (GTs) emerged as a powerful alternative to Message-Passing Neural
Networks (MPNNs). GTs, however, have quadratic computational cost, lack
inductive biases on graph structures, and rely on complex Positional/Structural
Encodings (SE/PE). In this paper, we show that while Transformers, complex
message-passing, and SE/PE are sufficient for good performance in practice,
none of them is necessary. Motivated by the recent success of State Space Models
(SSMs), such as Mamba, we present Graph Mamba Networks (GMNs), a general
framework for a new class of GNNs based on selective SSMs. We discuss and
categorize the new challenges when adopting SSMs to graph-structured data, and
present four required steps and one optional step to design GMNs, where we choose
(1) Neighborhood Tokenization, (2) Token Ordering, (3) Architecture of
Bidirectional Selective SSM Encoder, (4) Local Encoding, and dispensable (5) PE
and SE. We further provide theoretical justification for the power of GMNs.
Experiments demonstrate that, despite much lower computational cost, GMNs attain
outstanding performance on long-range, small-scale, large-scale, and
heterophilic benchmark datasets.
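To make the recipe above concrete, the following is a minimal, illustrative PyTorch sketch of the four required steps; it is not the authors' implementation. The `neighborhood_tokens` helper covers (1) neighborhood tokenization and (2) token ordering, a naive per-channel selective scan stands in for the Mamba kernel, `BidirectionalSSMEncoder` realizes (3), and mean pooling is used as a placeholder for (4) the local encoding. All names and design details (hop-wise tokens, softplus step sizes, mean pooling) are assumptions made for illustration only.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class SelectiveSSM(nn.Module):
    """Minimal input-dependent (selective) diagonal state space scan.

    A naive O(L) loop illustrating h_t = a_t * h_{t-1} + b_t * u_t and
    y_t = c_t * h_t, where a_t, b_t, c_t depend on the current token (the
    "selection" mechanism). NOT the hardware-aware parallel scan used by
    the official Mamba kernels.
    """

    def __init__(self, dim: int):
        super().__init__()
        self.A_log = nn.Parameter(torch.zeros(dim))  # learned per-channel decay rate
        self.delta = nn.Linear(dim, dim)             # input-dependent step size
        self.B = nn.Linear(dim, dim)
        self.C = nn.Linear(dim, dim)

    def forward(self, u: torch.Tensor) -> torch.Tensor:
        # u: (L, dim) ordered sequence of neighborhood tokens for one node
        dt = F.softplus(self.delta(u))               # (L, dim), positive step sizes
        a = torch.exp(-torch.exp(self.A_log) * dt)   # discretized decay in (0, 1)
        b = self.B(u) * dt
        c = self.C(u)
        h = u.new_zeros(u.shape[-1])                 # diagonal hidden state
        ys = []
        for t in range(u.shape[0]):
            h = a[t] * h + b[t] * u[t]
            ys.append(c[t] * h)
        return torch.stack(ys)                       # (L, dim)


class BidirectionalSSMEncoder(nn.Module):
    """Step (3): run the selective scan over the token sequence in both directions."""

    def __init__(self, dim: int):
        super().__init__()
        self.fwd = SelectiveSSM(dim)
        self.bwd = SelectiveSSM(dim)
        self.mix = nn.Linear(2 * dim, dim)

    def forward(self, tokens: torch.Tensor) -> torch.Tensor:
        out_f = self.fwd(tokens)
        out_b = self.bwd(tokens.flip(0)).flip(0)
        return self.mix(torch.cat([out_f, out_b], dim=-1))


def neighborhood_tokens(x: torch.Tensor, adj: dict, node: int, hops: int = 2) -> torch.Tensor:
    """Steps (1)-(2): tokenize a node's k-hop neighborhoods and order the tokens
    from the outermost hop down to the node itself, so the scan ends on the most
    local context. Mean pooling stands in for step (4), the local encoding."""
    hop_sets, visited = [{node}], {node}
    for _ in range(hops):
        frontier = {v for u in hop_sets[-1] for v in adj.get(u, []) if v not in visited}
        visited |= frontier
        hop_sets.append(frontier)
    tokens = [x[list(s)].mean(0) if s else x.new_zeros(x.shape[1]) for s in hop_sets]
    return torch.stack(tokens[::-1])  # order: outermost hop, ..., the node itself


# Toy usage: 5 nodes with 8-dim features on a small path graph.
x = torch.randn(5, 8)
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
encoder = BidirectionalSSMEncoder(dim=8)
node_repr = encoder(neighborhood_tokens(x, adj, node=2))[-1]  # (8,) embedding of node 2
```

Ordering the tokens from the outermost hop inward means the recurrent state accumulates distant context first and the final state is dominated by the node's local neighborhood; the bidirectional pass compensates for the fact that, unlike text, a graph neighborhood has no canonical direction.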