fMRI-3D: A Comprehensive Dataset for Enhancing fMRI-based 3D Reconstruction
September 17, 2024
Authors: Jianxiong Gao, Yuqian Fu, Yun Wang, Xuelin Qian, Jianfeng Feng, Yanwei Fu
cs.AI
Abstract
Reconstructing 3D visuals from functional Magnetic Resonance Imaging (fMRI)
data, introduced as Recon3DMind in our conference work, is of significant
interest to both cognitive neuroscience and computer vision. To advance this
task, we present the fMRI-3D dataset, which includes data from 15 participants
and showcases a total of 4768 3D objects. The dataset comprises two components:
fMRI-Shape, previously introduced and accessible at
https://huggingface.co/datasets/Fudan-fMRI/fMRI-Shape, and fMRI-Objaverse,
proposed in this paper and available at
https://huggingface.co/datasets/Fudan-fMRI/fMRI-Objaverse. fMRI-Objaverse
includes data from 5 subjects, 4 of whom are also part of the Core set in
fMRI-Shape, with each subject viewing 3142 3D objects across 117 categories,
all accompanied by text captions. This significantly enhances the diversity and
potential applications of the dataset. Additionally, we propose MinD-3D, a
novel framework designed to decode 3D visual information from fMRI signals. The
framework first extracts and aggregates features from fMRI data using a
neuro-fusion encoder, then employs a feature-bridge diffusion model to generate
visual features, and finally reconstructs the 3D object using a generative
transformer decoder. We establish new benchmarks by designing metrics at both
semantic and structural levels to evaluate model performance. Furthermore, we
assess our model's effectiveness in an Out-of-Distribution setting and analyze
the attribution of the extracted features and the visual ROIs in fMRI signals.
Our experiments demonstrate that MinD-3D not only reconstructs 3D objects with
high semantic and spatial accuracy but also deepens our understanding of how the
human brain processes 3D visual information. Project page:
https://jianxgao.github.io/MinD-3D.
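The three-stage MinD-3D pipeline (neuro-fusion encoder, feature-bridge diffusion model, generative transformer decoder) can be sketched at a purely schematic level as below. Every dimension, function body, and variable name here is an illustrative placeholder, not the paper's implementation: the real stages are learned networks, and the "diffusion" step is reduced to a toy interpolation just to show the data flow from fMRI voxels to a 3D point cloud.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions -- chosen for illustration, not from the paper.
N_VOXELS = 4096   # fMRI voxels in visual ROIs
D_FEAT = 512      # aggregated feature dimension
N_POINTS = 1024   # points in the reconstructed 3D object

def neuro_fusion_encoder(fmri, w_enc):
    """Stand-in for the neuro-fusion encoder: extract and aggregate
    features from fMRI data (here, a single linear map + nonlinearity)."""
    return np.tanh(fmri @ w_enc)

def feature_bridge_diffusion(feat, steps=10):
    """Stand-in for the feature-bridge diffusion model: start from noise
    and iteratively move toward visual features conditioned on fMRI
    features (a toy interpolation, not a real denoising schedule)."""
    x = rng.standard_normal(feat.shape)
    for t in range(steps):
        alpha = (t + 1) / steps
        x = (1 - alpha) * x + alpha * feat
    return x

def generative_transformer_decoder(visual_feat, w_dec):
    """Stand-in for the generative transformer decoder: map visual
    features to an (N_POINTS, 3) point cloud."""
    return (visual_feat @ w_dec).reshape(N_POINTS, 3)

# Simulated fMRI signal and placeholder "learned" weights.
fmri = rng.standard_normal(N_VOXELS)
w_enc = rng.standard_normal((N_VOXELS, D_FEAT)) / np.sqrt(N_VOXELS)
w_dec = rng.standard_normal((D_FEAT, N_POINTS * 3)) / np.sqrt(D_FEAT)

feat = neuro_fusion_encoder(fmri, w_enc)        # stage 1
visual = feature_bridge_diffusion(feat)         # stage 2
points = generative_transformer_decoder(visual, w_dec)  # stage 3
print(points.shape)  # (1024, 3)
```

The point of the sketch is only the interface between stages: fMRI voxels are compressed into a feature vector, that vector conditions a generative process producing visual features, and a decoder turns those features into explicit 3D geometry.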