Mozi: Governed Autonomy for Drug Discovery LLM Agents

Abstract

Tool-augmented large language model (LLM) agents promise to unify scientific reasoning with computation, yet their deployment in high-stakes domains like drug discovery is bottlenecked by two critical barriers: unconstrained tool use and poor long-horizon reliability. In dependency-heavy pharmaceutical pipelines, autonomous agents often drift into irreproducible trajectories, where early-stage hallucinations compound multiplicatively into downstream failures. To overcome this, we present Mozi, a dual-layer architecture that bridges the flexibility of generative AI with the deterministic rigor of computational biology. Layer A (Control Plane) establishes a governed supervisor--worker hierarchy that enforces role-based tool isolation, limits execution to constrained action spaces, and drives reflection-based replanning. Layer B (Workflow Plane) operationalizes canonical drug discovery stages -- from Target Identification to Lead Optimization -- as stateful, composable skill graphs. This layer integrates strict data contracts and strategic human-in-the-loop (HITL) checkpoints to safeguard scientific validity at high-uncertainty decision boundaries. Operating on the design principle of ``free-form reasoning for safe tasks, structured execution for long-horizon pipelines,'' Mozi provides built-in robustness mechanisms and trace-level auditability to mitigate error accumulation. We evaluate Mozi on PharmaBench, a curated benchmark for biomedical agents, demonstrating superior orchestration accuracy over existing baselines. Furthermore, through end-to-end therapeutic case studies, we demonstrate Mozi's ability to navigate massive chemical spaces, enforce stringent toxicity filters, and generate highly competitive in silico candidates, effectively transforming the LLM from a fragile conversationalist into a reliable, governed co-scientist.
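
To make the two layers concrete, the following is a minimal illustrative sketch and not Mozi's actual implementation. It assumes a hypothetical role-to-tool allowlist (standing in for Layer A's role-based tool isolation) and a hypothetical `Stage` record with `requires`/`provides` key sets (standing in for Layer B's data contracts) plus a `needs_approval` flag (standing in for a HITL checkpoint). All names, tools, and molecule identifiers here are invented for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, FrozenSet, List, Tuple

# Hypothetical allowlist: each worker role may call only its own tools.
ROLE_TOOLS: Dict[str, FrozenSet[str]] = {
    "target_worker": frozenset({"lookup_target"}),
    "chem_worker": frozenset({"filter_toxic"}),
}

# Toy stand-ins for real computational tools (assumptions, not real APIs).
TOOLS: Dict[str, Callable[[dict], dict]] = {
    "lookup_target": lambda s: {**s, "target": "EXAMPLE_TARGET"},
    "filter_toxic": lambda s: {
        **s,
        "candidates": [c for c in s["candidates"] if not c.endswith("_tox")],
    },
}


@dataclass(frozen=True)
class Stage:
    role: str
    tool: str
    requires: FrozenSet[str]      # data contract: keys needed on input
    provides: FrozenSet[str]      # data contract: keys guaranteed on output
    needs_approval: bool = False  # pause for a human decision after this stage


def run_pipeline(
    stages: List[Stage],
    state: dict,
    approve: Callable[[Stage, dict], bool],
) -> Tuple[dict, List[str]]:
    """Run stages in order, enforcing isolation, contracts, and checkpoints."""
    trace: List[str] = []  # simple audit trail of what happened
    for stage in stages:
        # Role-based tool isolation: reject tools outside the role's allowlist.
        if stage.tool not in ROLE_TOOLS.get(stage.role, frozenset()):
            raise PermissionError(f"{stage.role} may not call {stage.tool}")
        # Input contract: fail early instead of propagating a bad state.
        missing = stage.requires - set(state)
        if missing:
            raise ValueError(f"{stage.tool}: missing inputs {sorted(missing)}")
        state = TOOLS[stage.tool](state)
        # Output contract: the stage must deliver what it promised.
        absent = stage.provides - set(state)
        if absent:
            raise ValueError(f"{stage.tool}: missing outputs {sorted(absent)}")
        # HITL checkpoint: stop the pipeline if the reviewer rejects.
        if stage.needs_approval and not approve(stage, state):
            trace.append(f"{stage.tool}: rejected at checkpoint")
            break
        trace.append(f"{stage.tool}: ok")
    return state, trace


stages = [
    Stage("target_worker", "lookup_target", frozenset(), frozenset({"target"})),
    Stage(
        "chem_worker",
        "filter_toxic",
        frozenset({"target", "candidates"}),
        frozenset({"candidates"}),
        needs_approval=True,
    ),
]
final_state, trace = run_pipeline(
    stages,
    {"candidates": ["mol_a", "mol_b_tox"]},
    approve=lambda stage, state: True,  # auto-approve for the demo
)
```

In this sketch, failing fast on a contract violation is what keeps an early error from silently flowing into later stages; a real system would presumably attach richer schemas and a replanning step rather than simply raising an exception.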
