ProteinBench: A Holistic Evaluation of Protein Foundation Models
September 10, 2024
Authors: Fei Ye, Zaixiang Zheng, Dongyu Xue, Yuning Shen, Lihao Wang, Yiming Ma, Yan Wang, Xinyou Wang, Xiangxin Zhou, Quanquan Gu
cs.AI
Abstract
Recent years have witnessed a surge in the development of protein foundation
models, significantly improving performance in protein prediction and
generative tasks ranging from 3D structure prediction and protein design to
conformational dynamics. However, the capabilities and limitations associated
with these models remain poorly understood due to the absence of a unified
evaluation framework. To fill this gap, we introduce ProteinBench, a holistic
evaluation framework designed to enhance the transparency of protein foundation
models. Our approach consists of three key components: (i) A taxonomic
classification of tasks that broadly encompass the main challenges in the
protein domain, based on the relationships between different protein
modalities; (ii) A multi-metric evaluation approach that assesses performance
across four key dimensions: quality, novelty, diversity, and robustness; and
(iii) In-depth analyses from the perspective of various user objectives, providing a holistic view
of model performance. Our comprehensive evaluation of protein foundation models
reveals several key findings that shed light on their current capabilities and
limitations. To promote transparency and facilitate further research, we
publicly release the evaluation dataset, code, and a leaderboard for further
analysis, along with a general modular toolkit. We intend for ProteinBench to
be a living benchmark for establishing a standardized, in-depth evaluation
framework for protein foundation models, driving their development and
application while fostering collaboration within the field.