M^4olGen: Multi-Agent, Multi-Stage Molecular Generation under Precise Multi-Property Constraints

January 15, 2026
Authors: Yizhan Li, Florence Cloutier, Sifan Wu, Ali Parviz, Boris Knyazev, Yan Zhang, Glen Berseth, Bang Liu
cs.AI

Abstract

Generating molecules that satisfy precise numeric constraints over multiple physicochemical properties is both important and challenging. Although large language models (LLMs) are expressive, they struggle with precise multi-objective control and numeric reasoning when they lack external structure and feedback. We introduce M^4olGen, a fragment-level, retrieval-augmented, two-stage framework for molecule generation under multi-property constraints. In Stage I (prototype generation), a multi-agent reasoner performs retrieval-anchored, fragment-level edits to produce a candidate near the feasible region. In Stage II (RL-based fine-grained optimization), a fragment-level optimizer trained with Group Relative Policy Optimization (GRPO) applies one-hop or multi-hop refinements that explicitly minimize the error between the candidate's properties and the targets, while regulating edit complexity and deviation from the prototype. A large, automatically curated dataset of fragment-edit reasoning chains with measured property deltas underpins both stages, enabling deterministic, reproducible supervision and controllable multi-hop reasoning. Unlike prior work, our framework reasons about molecules at the fragment level and supports controllable refinement toward numeric targets. Experiments on generation under two sets of property constraints (QED, LogP, and molecular weight; HOMO and LUMO) show consistent gains in validity and in precise satisfaction of multi-property targets, outperforming strong LLMs and graph-based algorithms.
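The Stage II objective described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the reward shape, the weights `edit_weight` and `deviation_weight`, the per-property `scales`, and all candidate values are assumptions made for illustration. The only part taken from the standard definition of GRPO is the group-relative advantage, which normalizes each sampled candidate's reward by the mean and standard deviation of its group.

```python
from statistics import mean, pstdev


def property_error(predicted, target, scales):
    """Sum of scaled absolute errors over the target properties."""
    return sum(abs(predicted[k] - target[k]) / scales[k] for k in target)


def reward(predicted, target, scales, n_edits, deviation,
           edit_weight=0.1, deviation_weight=0.1):
    """Hypothetical reward: negative property error, minus penalties for
    edit complexity and for deviation from the Stage I prototype."""
    error = property_error(predicted, target, scales)
    return -(error + edit_weight * n_edits + deviation_weight * deviation)


def group_relative_advantages(rewards, eps=1e-8):
    """GRPO-style advantages: standardize rewards within one sampled group."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


# Illustrative targets and normalization scales (made-up values).
target = {"QED": 0.8, "LogP": 2.5, "MW": 350.0}
scales = {"QED": 1.0, "LogP": 5.0, "MW": 500.0}

# Each candidate: (predicted properties, number of edits, deviation from prototype).
candidates = [
    ({"QED": 0.78, "LogP": 2.6, "MW": 352.0}, 1, 0.10),
    ({"QED": 0.60, "LogP": 3.4, "MW": 410.0}, 1, 0.05),
    ({"QED": 0.79, "LogP": 2.5, "MW": 349.0}, 4, 0.60),
]

rewards = [reward(p, target, scales, n, d) for p, n, d in candidates]
advantages = group_relative_advantages(rewards)

for r, a in zip(rewards, advantages):
    print(f"reward={r:.4f}  advantage={a:+.4f}")
```

In this sketch the third candidate matches the targets most closely but pays for its four edits and large deviation, so the first candidate receives the highest advantage. That is the trade-off the abstract describes: minimizing property error while regulating edit complexity and deviation from the prototype.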