Mozi: Governed Autonomy for Drug Discovery LLM Agents

Abstract

Tool-augmented large language model (LLM) agents promise to unify scientific reasoning with computation, yet their deployment in high-stakes domains such as drug discovery is bottlenecked by two critical barriers: the governance of unconstrained tool use and poor long-horizon reliability. In dependency-heavy pharmaceutical pipelines, autonomous agents often drift into irreproducible trajectories, where early-stage hallucinations compound multiplicatively into downstream failures. To overcome this, we present Mozi, a dual-layer architecture that bridges the flexibility of generative AI with the deterministic rigor of computational biology. Layer A (Control Plane) establishes a governed supervisor-worker hierarchy that enforces role-based tool isolation, limits execution to constrained action spaces, and drives reflection-based replanning. Layer B (Workflow Plane) operationalizes the canonical drug discovery stages, from Target Identification to Lead Optimization, as stateful, composable skill graphs. This layer integrates strict data contracts and strategic human-in-the-loop (HITL) checkpoints to safeguard scientific validity at high-uncertainty decision boundaries. Operating on the design principle of "free-form reasoning for safe tasks, structured execution for long-horizon pipelines," Mozi provides built-in robustness mechanisms and trace-level auditability to mitigate error accumulation. We evaluate Mozi on PharmaBench, a curated benchmark for biomedical agents, demonstrating higher orchestration accuracy than existing baselines. Furthermore, through end-to-end therapeutic case studies, we demonstrate Mozi's ability to navigate massive chemical spaces, enforce stringent toxicity filters, and generate highly competitive in silico candidates, effectively transforming the LLM from a fragile conversationalist into a reliable, governed co-scientist.
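
To make the two layers concrete, the following is a minimal illustrative sketch, not Mozi's actual implementation. It assumes a supervisor that checks every tool call against a per-role allow-list and records a trace (Layer A), and a workflow in which each stage declares the state keys it requires and produces, with an optional human approval callback (Layer B). All names here (`Supervisor`, `Worker`, `Stage`, `run_pipeline`, the toy tools) are hypothetical, and the skill graph is simplified to a linear sequence of stages.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Set


class ToolAccessError(PermissionError):
    """Raised when a worker requests a tool outside its role's allow-list."""


@dataclass(frozen=True)
class Worker:
    role: str
    allowed_tools: frozenset  # hypothetical per-role allow-list


class Supervisor:
    """Layer A sketch: mediates every tool call and keeps an audit trace."""

    def __init__(self, tools: Dict[str, Callable[..., object]]):
        self.tools = tools
        self.trace: List[dict] = []

    def call(self, worker: Worker, tool: str, **kwargs):
        allowed = tool in self.tools and tool in worker.allowed_tools
        # Record the attempt whether or not it is permitted.
        self.trace.append({"role": worker.role, "tool": tool, "allowed": allowed})
        if not allowed:
            raise ToolAccessError(f"{worker.role} may not call {tool}")
        return self.tools[tool](**kwargs)


@dataclass
class Stage:
    """Layer B sketch: one workflow stage with a simple data contract."""
    name: str
    requires: Set[str]            # keys that must exist in the shared state
    produces: Set[str]            # keys the stage must return
    run: Callable[[dict], dict]
    needs_approval: bool = False  # stands in for a HITL checkpoint


def run_pipeline(stages, state, approve):
    """Run stages in order, enforcing contracts and approval checkpoints."""
    state = dict(state)
    for stage in stages:
        missing = stage.requires - set(state)
        if missing:
            raise ValueError(f"{stage.name}: missing inputs {sorted(missing)}")
        output = stage.run(state)
        absent = stage.produces - set(output)
        if absent:
            raise ValueError(f"{stage.name}: missing outputs {sorted(absent)}")
        if stage.needs_approval and not approve(stage.name, output):
            raise RuntimeError(f"checkpoint rejected at {stage.name}")
        state.update(output)
    return state


# Toy wiring with placeholder tools; no real chemistry or biology is involved.
supervisor = Supervisor({
    "lookup_target": lambda disease: f"target-for-{disease}",
    "score_toxicity": lambda compound: 0.1,
})
biologist = Worker("target_identification", frozenset({"lookup_target"}))

stages = [
    Stage(
        "target_identification", {"disease"}, {"target"},
        lambda s: {"target": supervisor.call(
            biologist, "lookup_target", disease=s["disease"])},
        needs_approval=True,
    ),
    Stage(
        "lead_optimization", {"target"}, {"candidates"},
        lambda s: {"candidates": [f"candidate-for-{s['target']}"]},
    ),
]

final_state = run_pipeline(
    stages, {"disease": "example"}, approve=lambda name, output: True
)
print(final_state["candidates"])
```

In this sketch a worker that requests a tool outside its role fails immediately and the attempt is still logged, while a stage that violates its contract stops the pipeline before bad data reaches later stages. This is one simple way to read the abstract's point about early errors compounding downstream.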
