TxAgent: An AI Agent for Therapeutic Reasoning Across a Universe of Tools

Abstract

Precision therapeutics require multimodal adaptive models that generate personalized treatment recommendations. We introduce TxAgent, an AI agent that leverages multi-step reasoning and real-time biomedical knowledge retrieval across a toolbox of 211 tools to analyze drug interactions, contraindications, and patient-specific treatment strategies. TxAgent evaluates how drugs interact at molecular, pharmacokinetic, and clinical levels, identifies contraindications based on patient comorbidities and concurrent medications, and tailors treatment strategies to individual patient characteristics. It retrieves and synthesizes evidence from multiple biomedical sources, assesses interactions between drugs and patient conditions, and refines treatment recommendations through iterative reasoning. It selects tools based on task objectives and executes structured function calls to solve therapeutic tasks that require clinical reasoning and cross-source validation. The ToolUniverse consolidates 211 tools from trusted sources, including all US FDA-approved drugs since 1939 and validated clinical insights from Open Targets. TxAgent outperforms leading LLMs, tool-use models, and reasoning agents across five new benchmarks: DrugPC, BrandPC, GenericPC, TreatmentPC, and DescriptionPC, covering 3,168 drug reasoning tasks and 456 personalized treatment scenarios. It achieves 92.1% accuracy in open-ended drug reasoning tasks, surpassing GPT-4o and outperforming DeepSeek-R1 (671B) in structured multi-step reasoning. TxAgent generalizes across drug name variants and descriptions. By integrating multi-step inference, real-time knowledge grounding, and tool-assisted decision-making, TxAgent ensures that treatment recommendations align with established clinical guidelines and real-world evidence, reducing the risk of adverse events and improving therapeutic decision-making.
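The tool-selection and structured function-call loop described above can be illustrated with a minimal sketch. This is not TxAgent's implementation: the tool names (`check_interaction`, `get_contraindications`), the `ToolCall` record, the rule-based `plan_next_call` planner (a stand-in for the model's reasoning step), and the two-entry toy knowledge tables are all hypothetical and chosen only to show the control flow of iterative, tool-assisted evidence gathering. The toy data is illustrative and not clinical guidance.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Optional, Tuple


@dataclass
class ToolCall:
    """A structured function call: a tool name plus keyword arguments."""
    name: str
    arguments: Dict[str, Any]


# Hypothetical toy tables standing in for real biomedical sources.
_INTERACTIONS = {frozenset({"warfarin", "aspirin"}): "increased bleeding risk"}
_CONTRAINDICATIONS = {"metformin": ["severe renal impairment"]}


def check_interaction(drug_a: str, drug_b: str) -> str:
    """Hypothetical tool: look up a pairwise drug interaction."""
    return _INTERACTIONS.get(frozenset({drug_a, drug_b}), "no known interaction")


def get_contraindications(drug: str) -> List[str]:
    """Hypothetical tool: list contraindications recorded for a drug."""
    return _CONTRAINDICATIONS.get(drug, [])


# Registry mapping tool names to callables (an assumed, simplified toolbox).
TOOLS: Dict[str, Callable[..., Any]] = {
    "check_interaction": check_interaction,
    "get_contraindications": get_contraindications,
}


def execute(call: ToolCall) -> Any:
    """Dispatch a structured call to the registered tool."""
    if call.name not in TOOLS:
        raise KeyError(f"unknown tool: {call.name}")
    return TOOLS[call.name](**call.arguments)


def _key(call: ToolCall) -> Tuple[str, Tuple[Tuple[str, Any], ...]]:
    """Hashable identity of a call, used to avoid repeating it."""
    return (call.name, tuple(sorted(call.arguments.items())))


def plan_next_call(task: Dict[str, Any],
                   evidence: List[Tuple[ToolCall, Any]]) -> Optional[ToolCall]:
    """Rule-based stand-in for the reasoning step: pick the next unmade call."""
    wanted = [ToolCall("get_contraindications", {"drug": task["candidate"]})]
    wanted += [
        ToolCall("check_interaction", {"drug_a": task["candidate"], "drug_b": med})
        for med in task["current_medications"]
    ]
    done = {_key(call) for call, _ in evidence}
    for call in wanted:
        if _key(call) not in done:
            return call
    return None  # nothing left to look up; the loop can stop


def run_agent(task: Dict[str, Any], max_steps: int = 10) -> List[Tuple[ToolCall, Any]]:
    """Iteratively select a tool, execute it, and accumulate the evidence."""
    evidence: List[Tuple[ToolCall, Any]] = []
    for _ in range(max_steps):
        call = plan_next_call(task, evidence)
        if call is None:
            break
        evidence.append((call, execute(call)))
    return evidence
```

Calling `run_agent({"candidate": "warfarin", "current_medications": ["aspirin", "metformin"]})` makes one contraindication lookup followed by one interaction check per concurrent medication. In a real agent the planner would be a language model choosing among many tools, and the accumulated evidence would feed a final recommendation step.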