Tabular LLMs for Interpretable Few-Shot Alzheimer's Disease Prediction with Multimodal Biomedical Data

Abstract

Accurate diagnosis of Alzheimer's disease (AD) requires handling tabular biomarker data, yet such data are often small and incomplete, a regime in which deep learning models frequently fail to outperform classical methods. Pretrained large language models (LLMs) offer few-shot generalization, structured reasoning, and interpretable outputs, providing a powerful new paradigm for clinical prediction. We propose TAP-GPT (Tabular Alzheimer's Prediction GPT), a domain-adapted tabular LLM framework built on TableGPT2 and fine-tuned for few-shot AD classification using tabular prompts rather than plain text. We evaluate TAP-GPT on binary AD classification across four ADNI-derived datasets, including QT-PAD biomarkers and region-level structural MRI, amyloid PET, and tau PET. Across multimodal and unimodal settings, TAP-GPT improves upon its backbone models and outperforms traditional machine learning baselines in the few-shot setting while remaining competitive with state-of-the-art general-purpose LLMs. We show that feature selection mitigates performance degradation on high-dimensional inputs and that TAP-GPT maintains stable performance under both simulated and real-world missingness without imputation. Additionally, TAP-GPT produces structured, modality-aware reasoning aligned with established AD biology and shows greater stability under self-reflection, supporting its use in iterative multi-agent systems. To our knowledge, this is the first systematic application of a tabular-specialized LLM to multimodal biomarker-based AD prediction. It demonstrates that such pretrained models can effectively address structured clinical prediction tasks, and it lays the foundation for tabular LLM-driven multi-agent clinical decision-support systems. The source code is publicly available on GitHub: https://github.com/sophie-kearney/TAP-GPT.

