Invariant Graph Transformer

Abstract

Rationale discovery is the task of finding a subset of the input data that maximally supports the prediction of a downstream task. In graph machine learning, a graph rationale is the critical subgraph of a given graph topology, the one that fundamentally determines the prediction result. The remainder of the graph is called the environment subgraph. Graph rationalization can improve model performance because the mapping between the graph rationale and the prediction label is assumed to be invariant. To ensure the discriminative power of the extracted rationale subgraphs, a key technique called "intervention" is applied. The core idea of intervention is that, however the environment subgraph changes, the semantics of the rationale subgraph remain invariant, which guarantees the correct prediction result. However, most, if not all, existing rationalization work on graph data develops its intervention strategies at the graph level, which is coarse-grained. In this paper, we propose well-tailored intervention strategies for graph data. Our idea is driven by the development of Transformer models, whose self-attention module provides rich interactions between input nodes. Based on the self-attention module, our proposed invariant graph Transformer (IGT) achieves fine-grained intervention, specifically at the node level and the virtual node level. Our comprehensive experiments involve 7 real-world datasets, and the proposed IGT shows significant performance advantages over 13 baseline methods.
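The rationale/environment split and the invariance that intervention is meant to enforce can be illustrated with a small toy sketch. This is not the IGT architecture: the per-node scores, scalar node features, the top-k split, and the helper names `split_nodes`, `score`, and `invariance_gap` are all assumptions made for illustration. The sketch pairs one graph's rationale nodes with different environments and measures how much a toy predictor's output changes.

```python
def split_nodes(scores, k):
    """Split node indices into (rationale, environment).

    Assumption: the k highest-scoring nodes form the rationale;
    all remaining nodes form the environment.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(order[:k]), sorted(order[k:])


def score(rationale_feats, env_feats, w_env):
    """Toy predictor: rationale sum plus a weighted environment sum.

    w_env = 0 means the predictor ignores the environment entirely.
    """
    return sum(rationale_feats) + w_env * sum(env_feats)


def invariance_gap(rationale_feats, envs, w_env):
    """Spread of predictions when the same rationale is paired with
    each environment in `envs` (a node-level swap of environments)."""
    outputs = [score(rationale_feats, env, w_env) for env in envs]
    return max(outputs) - min(outputs)


# Hypothetical graphs with scalar node features and per-node scores.
feats_a, scores_a = [2.0, -0.5, 1.5, 0.1], [0.9, 0.1, 0.8, 0.2]
feats_b, scores_b = [-3.0, 0.4, -2.0, 0.3], [0.7, 0.2, 0.9, 0.1]

rat_a, env_a = split_nodes(scores_a, k=2)
_, env_b = split_nodes(scores_b, k=2)

rationale = [feats_a[i] for i in rat_a]
environments = [[feats_a[i] for i in env_a], [feats_b[i] for i in env_b]]

print(invariance_gap(rationale, environments, w_env=0.0))  # environment ignored
print(invariance_gap(rationale, environments, w_env=1.0))  # environment leaks in
```

A predictor that relies only on the rationale has zero gap across environments, while one that also reads environment nodes does not. An intervention-based objective would penalize the second case.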
