Multi-Objective-Guided Discrete Flow Matching for Controllable Biological Sequence Design
May 11, 2025
Authors: Tong Chen, Yinuo Zhang, Sophia Tang, Pranam Chatterjee
cs.AI
Abstract
Designing biological sequences that satisfy multiple, often conflicting,
functional and biophysical criteria remains a central challenge in biomolecule
engineering. While discrete flow matching models have recently shown promise
for efficient sampling in high-dimensional sequence spaces, existing approaches
address only single objectives or require continuous embeddings that can
distort discrete distributions. We present Multi-Objective-Guided Discrete Flow
Matching (MOG-DFM), a general framework to steer any pretrained discrete-time
flow matching generator toward Pareto-efficient trade-offs across multiple
scalar objectives. At each sampling step, MOG-DFM computes a hybrid
rank-directional score for candidate transitions and applies an adaptive
hypercone filter to enforce consistent multi-objective progression. We also
trained two unconditional discrete flow matching models, PepDFM for diverse
peptide generation and EnhancerDFM for functional enhancer DNA generation, as
base generation models for MOG-DFM. We demonstrate MOG-DFM's effectiveness in
generating peptide binders optimized across five properties (hemolysis,
non-fouling, solubility, half-life, and binding affinity), and in designing DNA
sequences with specific enhancer classes and DNA shapes. In total, MOG-DFM
proves to be a powerful tool for multi-property-guided biomolecule sequence
design.
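The abstract refers to Pareto-efficient trade-offs and a hypercone filter without giving their definitions. The sketch below illustrates only the two generic ideas behind those terms: Pareto dominance among candidates scored on several objectives, and keeping candidates whose improvement vector lies within a fixed angle of a preferred trade-off direction. It is not the MOG-DFM algorithm. The function names, the fixed (non-adaptive) angle, and the toy numbers are all assumptions made for illustration, and all objectives are assumed to be maximized.

```python
import math


def dominates(a, b):
    """True if a is at least as good as b on every objective and
    strictly better on at least one (all objectives maximized)."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(points):
    """Return indices of points that no other point dominates."""
    return [
        i for i, p in enumerate(points)
        if not any(dominates(q, p) for j, q in enumerate(points) if j != i)
    ]


def cone_filter(deltas, direction, max_angle_deg):
    """Return indices of improvement vectors whose angle to `direction`
    is at most `max_angle_deg` (hypothetical, fixed-angle cone)."""
    keep = []
    d_norm = math.sqrt(sum(d * d for d in direction))
    for i, v in enumerate(deltas):
        v_norm = math.sqrt(sum(x * x for x in v))
        if v_norm == 0.0 or d_norm == 0.0:
            continue  # the angle is undefined for a zero vector
        cos = sum(x * d for x, d in zip(v, direction)) / (v_norm * d_norm)
        cos = max(-1.0, min(1.0, cos))  # guard against rounding error
        if math.degrees(math.acos(cos)) <= max_angle_deg:
            keep.append(i)
    return keep


# Toy objective scores for four candidate sequences (two objectives).
scores = [(0.9, 0.2), (0.5, 0.5), (0.4, 0.4), (0.2, 0.9)]
front = pareto_front(scores)

# Toy per-candidate objective improvements and a balanced trade-off direction.
deltas = [(1.0, 0.0), (1.0, 1.0), (-1.0, 0.5)]
narrow = cone_filter(deltas, (1.0, 1.0), 30.0)
wide = cone_filter(deltas, (1.0, 1.0), 60.0)
```

In this toy setting the candidate `(0.4, 0.4)` is dominated by `(0.5, 0.5)` and drops off the front, while widening the cone from 30 to 60 degrees admits the move that improves only the first objective. A guided sampler in the spirit of the abstract would presumably adjust the cone angle during sampling, but how it does so is not specified here.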