Accurate Prediction of Ligand-Protein Interaction Affinities with Fine-Tuned Small Language Models
June 27, 2024
Author: Ben Fauber
cs.AI
Abstract
We describe the accurate prediction of ligand-protein interaction (LPI)
affinities, also known as drug-target interactions (DTI), with instruction
fine-tuned pretrained generative small language models (SLMs). We achieved
accurate predictions for a range of affinity values associated with
ligand-protein interactions on out-of-sample data in a zero-shot setting. Only
the SMILES string of the ligand and the amino acid sequence of the protein were
used as the model inputs. Our results demonstrate a clear improvement over
machine learning (ML) and free-energy perturbation (FEP+) based methods in
accurately predicting a range of ligand-protein interaction affinities, which
can be leveraged to further accelerate drug discovery campaigns against
challenging therapeutic targets.
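
The abstract states that only the ligand's SMILES string and the protein's amino acid sequence are used as model inputs to an instruction fine-tuned SLM. As a minimal sketch of what assembling such an input could look like, the snippet below builds a text prompt from the two strings. The function name `build_lpi_prompt`, the prompt wording, the `affinity_type` parameter, and the example peptide are all assumptions made for illustration; the abstract does not specify the actual template or preprocessing.

```python
# Hypothetical sketch: the prompt template and helper name are illustrative
# assumptions, not the template used in the paper.

# The 20 standard amino acids in one-letter code.
STANDARD_AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")


def build_lpi_prompt(smiles: str, protein_sequence: str, affinity_type: str = "Ki") -> str:
    """Format a ligand SMILES string and a protein sequence as one instruction prompt."""
    smiles = smiles.strip()
    # Remove whitespace and line breaks (e.g. from FASTA-style wrapping), then upper-case.
    sequence = "".join(protein_sequence.split()).upper()

    if not smiles:
        raise ValueError("SMILES string is empty")
    if not sequence:
        raise ValueError("protein sequence is empty")
    unexpected = set(sequence) - STANDARD_AMINO_ACIDS
    if unexpected:
        raise ValueError(f"non-standard residues in sequence: {sorted(unexpected)}")

    return (
        f"Predict the {affinity_type} affinity for the ligand-protein pair.\n"
        f"Ligand SMILES: {smiles}\n"
        f"Protein sequence: {sequence}\n"
        "Affinity:"
    )


# Example: aspirin as the ligand and a short made-up peptide as the "protein".
prompt = build_lpi_prompt("CC(=O)OC1=CC=CC=C1C(=O)O", "mkt ayiakqr")
print(prompt)
```

Keeping both inputs as plain text means no structural or docking features are required, which matches the abstract's description of the inputs; everything else about the format above is a placeholder.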