DiffSpectra: Molecular Structure Elucidation from Spectra using Diffusion Models

Abstract

Molecular structure elucidation from spectra is a foundational problem in chemistry, with profound implications for compound identification, synthesis, and drug development. Traditional methods rely heavily on expert interpretation and lack scalability. Pioneering machine learning methods have introduced retrieval-based strategies, but their reliance on finite libraries limits generalization to novel molecules. Generative models offer a promising alternative, yet most adopt autoregressive SMILES-based architectures that overlook 3D geometry and struggle to integrate diverse spectral modalities. In this work, we present DiffSpectra, a generative framework that directly infers both 2D and 3D molecular structures from multi-modal spectral data using diffusion models. DiffSpectra formulates structure elucidation as a conditional generation process. Its denoising network is parameterized by the Diffusion Molecule Transformer, an SE(3)-equivariant architecture that integrates topological and geometric information. Conditioning is provided by SpecFormer, a transformer-based spectral encoder that captures intra- and inter-spectral dependencies in multi-modal spectra. Extensive experiments demonstrate that DiffSpectra achieves high accuracy in structure elucidation, recovering exact structures through sampling with 16.01% top-1 accuracy and 96.86% top-20 accuracy. The model benefits significantly from 3D geometric modeling, SpecFormer pre-training, and multi-modal conditioning. These results highlight the effectiveness of spectrum-conditioned diffusion modeling in addressing the challenge of molecular structure elucidation. To our knowledge, DiffSpectra is the first framework to unify multi-modal spectral reasoning and joint 2D/3D generative modeling for de novo molecular structure elucidation.
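As a rough illustration of how top-k exact-match accuracy over sampled structures could be computed, the sketch below ranks the candidates generated for each spectrum and checks whether the true structure appears among the first k. It is not taken from the paper: ranking candidates by how often they were sampled is an assumption, the function names `rank_candidates` and `top_k_accuracy` are hypothetical, and structures are assumed to be already-canonicalized strings (for example canonical SMILES) so that plain string equality stands in for exact structure match.

```python
from collections import Counter


def rank_candidates(samples):
    """Return unique sampled structures, most frequently sampled first.

    Assumption: sampling frequency is used as the ranking score.
    Ties keep the order in which candidates were first sampled.
    """
    counts = Counter(samples)
    return [structure for structure, _ in counts.most_common()]


def top_k_accuracy(samples_per_spectrum, targets, k):
    """Fraction of spectra whose true structure is among the top-k candidates.

    samples_per_spectrum: one list of sampled structure strings per spectrum
    targets: the true structure string for each spectrum
    Assumption: all strings are canonicalized, so equality means exact match.
    """
    hits = 0
    for samples, target in zip(samples_per_spectrum, targets):
        if target in rank_candidates(samples)[:k]:
            hits += 1
    return hits / len(targets)


# Toy data: two spectra with three sampled candidates each (made-up strings).
toy_samples = [["CCO", "CCO", "CO"], ["CC", "CN", "CN"]]
toy_targets = ["CCO", "CC"]
print(top_k_accuracy(toy_samples, toy_targets, k=1))
print(top_k_accuracy(toy_samples, toy_targets, k=2))
```

In this toy case the second spectrum's true structure is sampled only once, so it is missed at k=1 but recovered at k=2, which mirrors why top-20 accuracy can be far higher than top-1 when many candidates are sampled.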
