Can Natural Image Autoencoders Compactly Tokenize fMRI Volumes for Long-Range Dynamics Modeling?
April 4, 2026
Authors: Peter Yongho Kim, Juhyeon Park, Jungwoo Park, Jubin Choi, Jungwoo Seo, Jiook Cha, Taesup Moon
cs.AI
Abstract
Modeling long-range spatiotemporal dynamics in functional Magnetic Resonance Imaging (fMRI) remains a key challenge because the four-dimensional signals are so high-dimensional. Prior voxel-based models perform well and are interpretable, but their memory demands are prohibitive, so they can only capture limited temporal windows. To address this, we propose TABLeT (Two-dimensionally Autoencoded Brain Latent Transformer), a novel approach that tokenizes fMRI volumes using a pre-trained 2D natural image autoencoder. Each 3D fMRI volume is compressed into a compact set of continuous tokens, which enables long-sequence modeling with a simple Transformer encoder under limited VRAM. Across large-scale benchmarks including the UK-Biobank (UKB), Human Connectome Project (HCP), and ADHD-200 datasets, TABLeT outperforms existing models on multiple tasks, while showing substantial gains in computational and memory efficiency over the state-of-the-art voxel-based method given the same input. Furthermore, we develop a self-supervised masked token modeling approach to pre-train TABLeT, which improves the model's performance on various downstream tasks. Our findings suggest a promising approach for scalable and interpretable spatiotemporal modeling of brain activity. Our code is available at https://github.com/beotborry/TABLeT.
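The tokenization and masking ideas in the abstract can be illustrated with a toy sketch. This is not the TABLeT implementation: the abstract does not specify the autoencoder, how slices are arranged, or the masking scheme, so everything below is an assumption made for illustration. In particular, `toy_encoder` is a hypothetical stand-in (a fixed random patch projection) for a pre-trained 2D image autoencoder, and the volume size, downsampling factor, latent width, and mask ratio are arbitrary.

```python
import numpy as np


def toy_encoder(image, factor=8, latent_dim=4, seed=0):
    """Hypothetical stand-in for a pre-trained 2D autoencoder's encoder.

    Splits the image into non-overlapping factor x factor patches and maps
    each patch to a latent_dim vector with a fixed random projection.
    """
    h, w = image.shape
    assert h % factor == 0 and w % factor == 0, "image must tile evenly"
    rng = np.random.default_rng(seed)
    proj = rng.standard_normal((factor * factor, latent_dim))
    # (h, w) -> (h/f, f, w/f, f) -> (h/f, w/f, f, f) -> (num_patches, f*f)
    patches = image.reshape(h // factor, factor, w // factor, factor)
    patches = patches.transpose(0, 2, 1, 3).reshape(-1, factor * factor)
    return patches @ proj  # shape: (num_patches, latent_dim)


def tokenize_volume(volume, factor=8, latent_dim=4):
    """Encode each 2D slice along the last axis and stack the tokens."""
    slices = [volume[:, :, z] for z in range(volume.shape[2])]
    tokens = [toy_encoder(s, factor, latent_dim) for s in slices]
    return np.concatenate(tokens, axis=0)  # shape: (num_tokens, latent_dim)


def mask_tokens(tokens, ratio=0.5, seed=0):
    """Zero out a random subset of tokens; return masked tokens and the mask.

    A masked-token objective would train a model to predict the original
    values at the masked positions (the model itself is omitted here).
    """
    rng = np.random.default_rng(seed)
    n = tokens.shape[0]
    chosen = rng.permutation(n)[: int(n * ratio)]
    mask = np.zeros(n, dtype=bool)
    mask[chosen] = True
    masked = tokens.copy()
    masked[mask] = 0.0
    return masked, mask


# Arbitrary toy volume: 96 x 96 voxels in-plane, 16 slices.
volume = np.random.default_rng(1).standard_normal((96, 96, 16))
tokens = tokenize_volume(volume)
masked, mask = mask_tokens(tokens)
compression = volume.size / tokens.size  # voxels per latent value
```

With these toy settings, each slice yields 12 x 12 = 144 tokens, so one volume becomes 2304 tokens of width 4. The point of the sketch is only that a sequence model then sees far fewer values per time point than the raw voxel grid, which is what makes longer temporal windows affordable.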