Brain-IT: Image Reconstruction from fMRI via Brain-Interaction Transformer

Abstract

Reconstructing images seen by people from their fMRI brain recordings provides a non-invasive window into the human brain. Despite recent progress enabled by diffusion models, current methods often lack faithfulness to the images actually seen. We present "Brain-IT", a brain-inspired approach that addresses this challenge through a Brain Interaction Transformer (BIT), which allows effective interactions between clusters of functionally similar brain voxels. These functional clusters are shared by all subjects and serve as building blocks for integrating information both within and across brains. All model components are shared by all clusters and subjects, allowing efficient training with a limited amount of data. To guide the image reconstruction, BIT predicts two complementary localized patch-level image features: (i) high-level semantic features, which steer the diffusion model toward the correct semantic content of the image; and (ii) low-level structural features, which help initialize the diffusion process with the correct coarse layout of the image. BIT's design enables a direct flow of information from brain-voxel clusters to localized image features. Through these principles, our method produces reconstructions from fMRI that are faithful to the seen images and surpass current state-of-the-art (SotA) approaches both visually and by standard objective metrics. Moreover, with only 1 hour of fMRI data from a new subject, we achieve results comparable to current methods trained on full 40-hour recordings.
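To make the idea of shared functional clusters feeding localized image features more concrete, the following is a minimal NumPy sketch and not the authors' implementation. It assumes a hypothetical per-subject voxel-to-cluster assignment (`cluster_ids`), mean pooling of voxels into one token per cluster, and a single-head cross-attention in which image-patch queries read from the cluster tokens. All names, shapes, and the pooling choice are illustrative assumptions; the abstract does not specify how clustering or attention is actually done.

```python
import numpy as np


def cluster_tokens(voxels, cluster_ids, n_clusters):
    """Mean-pool voxel activations into one value per functional cluster.

    voxels:      (n_voxels,) activations for one subject and one image.
    cluster_ids: (n_voxels,) integer cluster index per voxel (hypothetical
                 assignment; how clusters are found is not specified here).
    Returns an (n_clusters,) array; empty clusters are set to 0.
    """
    sums = np.bincount(cluster_ids, weights=voxels, minlength=n_clusters)
    counts = np.bincount(cluster_ids, minlength=n_clusters)
    return sums / np.maximum(counts, 1)


def cross_attention(queries, keys, values):
    """Single-head scaled dot-product attention.

    queries: (n_patches, d) one query per image patch.
    keys, values: (n_clusters, d) cluster tokens.
    Returns (output, weights) with shapes (n_patches, d), (n_patches, n_clusters).
    """
    scores = queries @ keys.T / np.sqrt(queries.shape[-1])
    scores = scores - scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ values, weights


def predict_patch_features(voxels, cluster_ids, params):
    """Map one subject's voxels to patch-level features with shared weights."""
    pooled = cluster_tokens(voxels, cluster_ids, params["cluster_embed"].shape[0])
    # Shared embedding: scale a shared direction by the pooled activation and
    # add a per-cluster embedding (illustrative choice, an assumption).
    tokens = pooled[:, None] * params["value_proj"][None, :] + params["cluster_embed"]
    out, _ = cross_attention(params["patch_queries"], tokens, tokens)
    return out


rng = np.random.default_rng(0)
n_clusters, n_patches, d = 8, 4, 16
params = {
    "cluster_embed": rng.normal(size=(n_clusters, d)),
    "value_proj": rng.normal(size=d),
    "patch_queries": rng.normal(size=(n_patches, d)),
}

# Two subjects with different voxel counts use the same parameters,
# because both are reduced to the same set of cluster tokens.
for n_voxels in (1000, 1500):
    voxels = rng.normal(size=n_voxels)
    ids = rng.integers(0, n_clusters, size=n_voxels)
    feats = predict_patch_features(voxels, ids, params)
    print(n_voxels, feats.shape)
```

The point of the sketch is only that, once voxels are grouped into clusters common to all subjects, the number of tokens no longer depends on a subject's voxel count, so the same weights can be applied to every subject.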
