MolHIT: Advancing Molecular-Graph Generation with Hierarchical Discrete Diffusion Models

February 19, 2026
Authors: Hojung Jung, Rodrigo Hormazabal, Jaehyeong Jo, Youngrok Park, Kyunggeun Roh, Se-Young Yun, Sehui Han, Dae-Woong Jeong
cs.AI

Abstract

Molecular generation with diffusion models has emerged as a promising direction for AI-driven drug discovery and materials science. Graph diffusion models have been widely adopted because 2D molecular graphs are discrete, but existing models suffer from low chemical validity and, compared with 1D modeling, struggle to meet desired property targets. In this work, we introduce MolHIT, a molecular graph generation framework that overcomes long-standing performance limitations of existing methods. MolHIT is based on two components: the Hierarchical Discrete Diffusion Model, which generalizes discrete diffusion to additional categories that encode chemical priors, and decoupled atom encoding, which splits atom types according to their chemical roles. MolHIT achieves new state-of-the-art performance on the MOSES dataset, with near-perfect validity for the first time in graph diffusion, and surpasses strong 1D baselines across multiple metrics. We further demonstrate strong performance in downstream tasks, including multi-property guided generation and scaffold extension.