ProteinBench: A Holistic Evaluation of Protein Foundation Models

Abstract

Recent years have seen a surge in the development of protein foundation models, significantly improving performance on protein prediction and generative tasks ranging from 3D structure prediction and protein design to conformational dynamics. However, the capabilities and limitations of these models remain poorly understood because no unified evaluation framework exists. To fill this gap, we introduce ProteinBench, a holistic evaluation framework designed to enhance the transparency of protein foundation models. Our approach consists of three key components: (i) a taxonomic classification of tasks that broadly covers the main challenges in the protein domain, based on the relationships between different protein modalities; (ii) a multi-metric evaluation approach that assesses performance across four key dimensions: quality, novelty, diversity, and robustness; and (iii) in-depth analyses from the perspective of various user objectives, providing a holistic view of model performance. Our comprehensive evaluation of protein foundation models reveals several key findings that shed light on their current capabilities and limitations. To promote transparency and facilitate further research, we publicly release the evaluation dataset, the code, a public leaderboard for further analysis, and a general modular toolkit. We intend ProteinBench to be a living benchmark that establishes a standardized, in-depth evaluation framework for protein foundation models, driving their development and application while fostering collaboration within the field.
