MolSpectra: Pre-training 3D Molecular Representation with Multi-modal Energy Spectra

February 22, 2025
Authors: Liang Wang, Shaozhen Liu, Yu Rong, Deli Zhao, Qiang Liu, Shu Wu, Liang Wang
cs.AI

Abstract

Establishing the relationship between 3D structures and the energy states of molecular systems has proven to be a promising approach for learning 3D molecular representations. However, existing methods are limited to modeling the molecular energy states from classical mechanics. This limitation results in a significant oversight of quantum mechanical effects, such as quantized (discrete) energy level structures, which offer a more accurate estimation of molecular energy and can be experimentally measured through energy spectra. In this paper, we propose to utilize the energy spectra to enhance the pre-training of 3D molecular representations (MolSpectra), thereby infusing the knowledge of quantum mechanics into the molecular representations. Specifically, we propose SpecFormer, a multi-spectrum encoder for encoding molecular spectra via masked patch reconstruction. By further aligning outputs from the 3D encoder and spectrum encoder using a contrastive objective, we enhance the 3D encoder's understanding of molecules. Evaluations on public benchmarks reveal that our pre-trained representations surpass existing methods in predicting molecular properties and modeling dynamics.
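
The two training signals named in the abstract, masked patch reconstruction on spectra and contrastive alignment between the 3D encoder and the spectrum encoder, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration rather than the authors' implementation: the function names, the patch length, stride, mask ratio, and temperature are hypothetical, the encoders themselves are omitted, and InfoNCE is used as one common form of contrastive objective since the abstract does not specify the exact loss.

```python
import math
import random


def patchify(spectrum, patch_len, stride):
    """Split a 1D spectrum (list of intensities) into possibly overlapping patches."""
    return [spectrum[i:i + patch_len]
            for i in range(0, len(spectrum) - patch_len + 1, stride)]


def mask_patches(patches, mask_ratio, rng):
    """Zero out a random subset of patches; return the masked copy and masked indices."""
    n_mask = max(1, int(round(mask_ratio * len(patches))))
    masked_idx = sorted(rng.sample(range(len(patches)), n_mask))
    masked = [[0.0] * len(p) if i in masked_idx else list(p)
              for i, p in enumerate(patches)]
    return masked, masked_idx


def reconstruction_loss(pred, target, masked_idx):
    """Mean squared error computed only over the masked patches."""
    errs = [(a - b) ** 2
            for i in masked_idx
            for a, b in zip(pred[i], target[i])]
    return sum(errs) / len(errs)


def _normalize(v):
    """Scale a vector to unit length (zero vectors are left unchanged)."""
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]


def info_nce(z_3d, z_spec, temperature=0.1):
    """InfoNCE loss: the i-th 3D embedding should match the i-th spectrum embedding.

    Other molecules in the batch act as negatives. The temperature value is
    an illustrative assumption.
    """
    a = [_normalize(v) for v in z_3d]
    b = [_normalize(v) for v in z_spec]
    loss = 0.0
    for i, ai in enumerate(a):
        logits = [sum(x * y for x, y in zip(ai, bj)) / temperature for bj in b]
        m = max(logits)  # subtract the max for a numerically stable log-sum-exp
        log_denom = m + math.log(sum(math.exp(l - m) for l in logits))
        loss += log_denom - logits[i]
    return loss / len(a)


if __name__ == "__main__":
    rng = random.Random(0)
    spectrum = [float(i) for i in range(20)]  # toy stand-in for a measured spectrum
    patches = patchify(spectrum, patch_len=4, stride=2)
    masked, idx = mask_patches(patches, mask_ratio=0.3, rng=rng)
    # A real model would predict the masked patches; here the masked input
    # itself serves as a trivially poor "prediction".
    print("reconstruction loss:", reconstruction_loss(masked, patches, idx))
    print("contrastive loss:", info_nce([[1.0, 0.0], [0.0, 1.0]],
                                        [[1.0, 0.0], [0.0, 1.0]]))
```

In a full pipeline, a spectrum encoder would predict the masked patches from the visible ones, and the contrastive term would be computed on the pooled outputs of the two encoders for the same batch of molecules. How the two losses are weighted against each other is not stated in the abstract.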
