

GenoMAS: A Multi-Agent Framework for Scientific Discovery via Code-Driven Gene Expression Analysis

July 28, 2025
Authors: Haoyang Liu, Yijiang Li, Haohan Wang
cs.AI

Abstract

Gene expression analysis holds the key to many biomedical discoveries, yet extracting insights from raw transcriptomic data remains formidable due to the complexity of multiple large, semi-structured files and the need for extensive domain expertise. Current automation approaches are often limited either by inflexible workflows that break down in edge cases or by fully autonomous agents that lack the precision needed for rigorous scientific inquiry. GenoMAS charts a different course by presenting a team of LLM-based scientists that integrates the reliability of structured workflows with the adaptability of autonomous agents. GenoMAS orchestrates six specialized LLM agents through typed message-passing protocols, each contributing complementary strengths to a shared analytic canvas. At the heart of GenoMAS lies a guided-planning framework: programming agents unfold high-level task guidelines into Action Units and, at each juncture, elect to advance, revise, bypass, or backtrack, thereby maintaining logical coherence while adapting to the idiosyncrasies of genomic data. On the GenoTEX benchmark, GenoMAS reaches a Composite Similarity Correlation of 89.13% for data preprocessing and an F_1 of 60.48% for gene identification, surpassing the best prior art by 10.61% and 16.85%, respectively. Beyond metrics, GenoMAS surfaces biologically plausible gene-phenotype associations corroborated by the literature, all while adjusting for latent confounders. Code is available at https://github.com/Liu-Hy/GenoMAS.
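
The guided-planning idea in the abstract (walk a list of Action Units and, at each one, choose to advance, revise, bypass, or backtrack) can be illustrated with a small Python sketch. This is not the GenoMAS implementation: `ActionUnit`, `Decision`, `run_plan`, `toy_policy`, and the unit names are all hypothetical, and the hand-written rule in `toy_policy` stands in for the decision an LLM agent would make in the real system.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, List, Tuple


class Decision(Enum):
    """The four choices named in the abstract."""
    ADVANCE = "advance"      # accept this unit's result and move on
    REVISE = "revise"        # retry the same unit
    BYPASS = "bypass"        # skip this unit without completing it
    BACKTRACK = "backtrack"  # return to the previous unit


@dataclass
class ActionUnit:
    """Hypothetical stand-in for one step of a task guideline."""
    name: str


def run_plan(
    units: List[ActionUnit],
    decide: Callable[[ActionUnit, int], Decision],
    max_steps: int = 50,
) -> Tuple[List[Tuple[str, str]], List[str]]:
    """Walk the units in order, asking `decide` what to do at each one.

    Returns the decision trace and the names of completed units.
    `max_steps` guards against a policy that never terminates.
    """
    trace: List[Tuple[str, str]] = []
    completed: List[str] = []
    attempts = [0] * len(units)
    i = 0
    steps = 0
    while i < len(units) and steps < max_steps:
        steps += 1
        unit = units[i]
        attempts[i] += 1
        decision = decide(unit, attempts[i])
        trace.append((unit.name, decision.value))
        if decision is Decision.ADVANCE:
            completed.append(unit.name)
            i += 1
        elif decision is Decision.BYPASS:
            i += 1
        elif decision is Decision.BACKTRACK:
            i = max(i - 1, 0)
            # Discard the earlier unit's result, if any, so it is redone.
            if units[i].name in completed:
                completed.remove(units[i].name)
        # Decision.REVISE: stay on the same unit and try again.
    return trace, completed


def toy_policy(unit: ActionUnit, attempt: int) -> Decision:
    """Hand-written rule used only for illustration (an assumption)."""
    if unit.name == "map_gene_ids" and attempt == 1:
        return Decision.REVISE
    if unit.name == "normalize" and attempt == 1:
        return Decision.BYPASS
    if unit.name == "find_genes" and attempt == 1:
        return Decision.BACKTRACK
    return Decision.ADVANCE


UNITS = [
    ActionUnit("load_data"),
    ActionUnit("map_gene_ids"),
    ActionUnit("normalize"),
    ActionUnit("find_genes"),
]

trace, completed = run_plan(UNITS, toy_policy)
for name, decision in trace:
    print(f"{name}: {decision}")
```

In this toy run the normalization step is first bypassed, then redone after the final step backtracks to it, which is the kind of local course correction the abstract describes. A fixed linear workflow would have no way to express that.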