From Activation to Causality: Discovering Causal Visual Representations in the Human Brain

Abstract

Identifying which regions of the human brain represent a given visual concept is a central challenge in neuroscience. Existing approaches localize coarse functional regions (e.g., faces, places) through activation maximization, identifying regions that respond more strongly to a target concept than to other concepts. Yet strong activation alone does not establish that a region represents the concept itself, because the response may instead be driven by correlated visual or semantic cues. We introduce BrainCause, an automated framework that combines generative and brain models to synthesize controlled stimuli and validate neural representations through targeted causal testing. Given a query specifying a concept of interest, the framework constructs a targeted stimulus set comprising concept images, counterfactual edits that remove the target concept while preserving the remaining image content, and images containing candidate correlated distractors. It then uses an image-to-fMRI encoding model to predict brain responses and searches for representations that respond specifically to the target concept rather than to its correlated alternatives. BrainCause returns validated candidate representations and proposes follow-up fMRI experiments to further test or extend its discoveries. Our approach recovers known functional localizations and identifies new candidate representations across dozens of concepts, validated on both predicted and measured fMRI data. Critically, we show that without causal validation a large fraction of localizations would be false positives, confirming that activation alone is insufficient evidence of representation.
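The contrast between activation-only localization and the specificity test described above can be illustrated with a minimal sketch. The abstract does not state how specificity is scored, so everything here is an assumption made for illustration: the function name `select_voxels`, the mean-difference criterion, the `margin` threshold, and the toy response values are hypothetical and are not taken from BrainCause.

```python
from statistics import mean


def select_voxels(concept, counterfactual, distractor, margin=0.5):
    """Compare an activation-only criterion with a specificity criterion.

    Each argument maps a voxel id to a list of predicted responses,
    one per image in that stimulus set. The criteria and the margin
    are illustrative assumptions, not the paper's method.
    """
    activation_only, validated = [], []
    for voxel in concept:
        c = mean(concept[voxel])          # concept images
        cf = mean(counterfactual[voxel])  # concept edited out
        d = mean(distractor[voxel])       # correlated distractors only

        # Activation-only: a strong mean response to concept images.
        if c > margin:
            activation_only.append(voxel)

        # Specificity: the response must drop when the concept is removed
        # and must exceed the response to correlated distractors.
        if c - cf > margin and c - d > margin:
            validated.append(voxel)
    return activation_only, validated


# Toy predicted responses (made-up numbers, not data from the paper).
concept = {"v1": [1.2, 1.0, 1.1], "v2": [1.3, 1.1, 1.2], "v3": [0.1, 0.0, 0.2]}
counterfactual = {"v1": [0.1, 0.2, 0.0], "v2": [1.2, 1.0, 1.1], "v3": [0.1, 0.1, 0.0]}
distractor = {"v1": [0.2, 0.1, 0.3], "v2": [1.1, 1.2, 1.0], "v3": [0.0, 0.1, 0.1]}

activation_only, validated = select_voxels(concept, counterfactual, distractor)
print("activation only:", activation_only)
print("validated:", validated)
```

In this toy setup, voxel `v2` responds strongly to concept images but responds almost as strongly once the concept is edited out, so it passes the activation-only criterion and fails the specificity criterion. That is the kind of false positive the counterfactual and distractor sets are meant to expose.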