Graph Mamba: Towards Learning on Graphs with State Space Models
February 13, 2024
Authors: Ali Behrouz, Farnoosh Hashemi
cs.AI
Abstract
Graph Neural Networks (GNNs) have shown promising potential in graph representation learning. The majority of GNNs define a local message-passing mechanism, propagating information over the graph by stacking multiple layers. These methods, however, are known to suffer from two major limitations: over-squashing and poor capturing of long-range dependencies. Recently, Graph Transformers (GTs) emerged as a powerful alternative to Message-Passing Neural Networks (MPNNs). GTs, however, have quadratic computational cost, lack inductive biases on graph structures, and rely on complex Positional/Structural Encodings (PE/SE). In this paper, we show that while Transformers, complex message-passing, and PE/SE are sufficient for good performance in practice, none of them is necessary. Motivated by the recent success of State Space Models (SSMs), such as Mamba, we present Graph Mamba Networks (GMNs), a general framework for a new class of GNNs based on selective SSMs. We discuss and categorize the new challenges that arise when adapting SSMs to graph-structured data, and present four required steps and one optional step for designing GMNs: (1) Neighborhood Tokenization, (2) Token Ordering, (3) the architecture of a Bidirectional Selective SSM Encoder, (4) Local Encoding, and the dispensable (5) PE and SE. We further provide theoretical justification for the power of GMNs. Experiments demonstrate that, despite a much lower computational cost, GMNs attain outstanding performance on long-range, small-scale, large-scale, and heterophilic benchmark datasets.