Accurate Prediction of Ligand-Protein Interaction Affinities with Fine-Tuned Small Language Models
June 27, 2024
Author: Ben Fauber
cs.AI
Abstract
We describe the accurate prediction of ligand-protein interaction (LPI)
affinities, also known as drug-target interactions (DTI), with instruction
fine-tuned pretrained generative small language models (SLMs). We achieved
accurate predictions for a range of affinity values associated with
ligand-protein interactions on out-of-sample data in a zero-shot setting. Only
the SMILES string of the ligand and the amino acid sequence of the protein were
used as the model inputs. Our results demonstrate a clear improvement over
machine learning (ML) and free-energy perturbation (FEP+) based methods in
accurately predicting a range of ligand-protein interaction affinities, which
can be leveraged to further accelerate drug discovery campaigns against
challenging therapeutic targets.
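
The abstract states that the only model inputs are the ligand's SMILES string and the protein's amino acid sequence, and that the models are instruction fine-tuned. The following is a minimal sketch of how such an input might be serialized into an instruction-style text prompt. The prompt template, the `LPIExample` class, the function names, and the example affinity value are all illustrative assumptions; the abstract does not specify the actual prompt format or affinity units used by the author.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LPIExample:
    """One ligand-protein pair (hypothetical container, not from the paper)."""
    smiles: str                       # ligand as a SMILES string
    protein_sequence: str             # protein as a one-letter amino acid sequence
    affinity: Optional[float] = None  # label; only needed for fine-tuning


# Assumed template: the real instruction wording is not given in the abstract.
PROMPT_TEMPLATE = (
    "### Instruction:\n"
    "Predict the interaction affinity of the ligand for the protein.\n"
    "### Ligand SMILES:\n{smiles}\n"
    "### Protein sequence:\n{sequence}\n"
    "### Affinity:\n"
)


def build_prompt(example: LPIExample) -> str:
    """Serialize the two inputs into a prompt the model would complete."""
    return PROMPT_TEMPLATE.format(
        smiles=example.smiles.strip(),
        sequence=example.protein_sequence.strip().upper(),
    )


def build_training_text(example: LPIExample) -> str:
    """Prompt plus target value, as one fine-tuning record."""
    if example.affinity is None:
        raise ValueError("a labeled affinity is required for training text")
    return build_prompt(example) + f"{example.affinity:.2f}"


if __name__ == "__main__":
    # Aspirin SMILES with a made-up short sequence and a placeholder label.
    demo = LPIExample("CC(=O)Oc1ccccc1C(=O)O", "mkvlaagiv", affinity=6.5)
    print(build_training_text(demo))
```

At inference time only `build_prompt` would be used, and the model's generated continuation would be parsed as the predicted affinity; during fine-tuning, `build_training_text` supplies the full prompt-plus-label string.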