Accurate Prediction of Ligand-Protein Interaction Affinities with Fine-Tuned Small Language Models

Abstract

We describe the accurate prediction of ligand-protein interaction (LPI) affinities, also known as drug-target interaction (DTI) affinities, using instruction fine-tuned, pretrained generative small language models (SLMs). In a zero-shot setting, the models accurately predicted a range of affinity values for ligand-protein interactions on out-of-sample data. The only model inputs were the SMILES string of the ligand and the amino acid sequence of the protein. Our results show a clear improvement over machine learning (ML) and free-energy perturbation (FEP+) based methods in predicting a range of ligand-protein interaction affinities, which can be leveraged to further accelerate drug discovery campaigns against challenging therapeutic targets.
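
As an illustration of the input format described above, the following minimal sketch shows how a ligand SMILES string and a protein amino acid sequence might be combined into a single instruction-style text prompt. The function name `build_prompt`, the template wording, and the example peptide are hypothetical assumptions made for illustration; the abstract does not specify the actual prompt format used for fine-tuning.

```python
def build_prompt(smiles: str, sequence: str) -> str:
    """Format one ligand-protein pair as an instruction-style prompt.

    The template below is a hypothetical example, not the authors' format.
    """
    smiles = smiles.strip()
    # Remove any whitespace inside the sequence and normalize to upper case.
    sequence = "".join(sequence.split()).upper()
    if not smiles or not sequence:
        raise ValueError("both a SMILES string and a protein sequence are required")
    return (
        "Instruction: Predict the binding affinity of the ligand for the protein.\n"
        f"Ligand SMILES: {smiles}\n"
        f"Protein sequence: {sequence}\n"
        "Affinity:"
    )


# Aspirin SMILES paired with a short made-up peptide, purely for illustration.
prompt = build_prompt("CC(=O)OC1=CC=CC=C1C(=O)O", "mktayiak qrqisfvk")
print(prompt)
```

In a setup like this, the model would be expected to complete the text after `Affinity:` with a predicted value; how the value is represented and parsed is left open here.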
