Towards 3D Molecule-Text Interpretation in Language Models
January 25, 2024
Authors: Sihang Li, Zhiyuan Liu, Yanchen Luo, Xiang Wang, Xiangnan He, Kenji Kawaguchi, Tat-Seng Chua, Qi Tian
cs.AI
Abstract
Language Models (LMs) have greatly influenced diverse domains. However, their
inherent limitation in comprehending 3D molecular structures has considerably
constrained their potential in the biomolecular domain. To bridge this gap, we
focus on 3D molecule-text interpretation, and propose 3D-MoLM: 3D-Molecular
Language Modeling. Specifically, 3D-MoLM enables an LM to interpret and analyze
3D molecules by equipping the LM with a 3D molecular encoder. This integration
is achieved by a 3D molecule-text projector, bridging the 3D molecular
encoder's representation space and the LM's input space. Moreover, to enhance
3D-MoLM's cross-modal molecular understanding and instruction-following
abilities, we meticulously curated a 3D molecule-centric instruction tuning
dataset, 3D-MoIT. Through 3D molecule-text alignment and 3D molecule-centric
instruction tuning, 3D-MoLM integrates the 3D molecular encoder with the LM.
It significantly surpasses existing baselines on downstream tasks,
including molecule-text retrieval, molecule captioning, and more challenging
open-text molecular QA tasks, especially focusing on 3D-dependent properties.
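To make the projector idea in the abstract concrete, here is a minimal sketch of mapping molecular encoder outputs into an LM's input space and prepending them to text token embeddings. It assumes a plain linear map as a stand-in. The abstract does not specify the projector's design, so this is not the authors' implementation. The function names, the dimensions `d_enc` and `d_lm`, and the random embeddings are all hypothetical.

```python
import random

def project(mol_emb, weight, bias):
    """Map encoder outputs (n_tokens x d_enc) into the LM input space (n_tokens x d_lm).

    Assumption: a single linear layer, used here only for illustration.
    """
    columns = list(zip(*weight))  # d_lm columns, each of length d_enc
    return [
        [sum(x * w for x, w in zip(row, col)) + b for col, b in zip(columns, bias)]
        for row in mol_emb
    ]

def build_lm_inputs(mol_emb, text_emb, weight, bias):
    """Prepend projected molecule tokens to the text token embeddings."""
    return project(mol_emb, weight, bias) + text_emb

random.seed(0)
d_enc, d_lm = 4, 6            # hypothetical encoder and LM hidden sizes
n_mol, n_text = 3, 5          # hypothetical token counts

mol_emb = [[random.gauss(0, 1) for _ in range(d_enc)] for _ in range(n_mol)]
text_emb = [[random.gauss(0, 1) for _ in range(d_lm)] for _ in range(n_text)]
weight = [[random.gauss(0, 0.1) for _ in range(d_lm)] for _ in range(d_enc)]
bias = [0.0] * d_lm

lm_inputs = build_lm_inputs(mol_emb, text_emb, weight, bias)
print(len(lm_inputs), len(lm_inputs[0]))
```

The only point of the sketch is the shape contract: after projection, molecule tokens have the same width as text embeddings, so the LM can consume both as one sequence.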