COSMOS: Predictable and Cost-Effective Adaptation of LLMs

Abstract

Large language models (LLMs) achieve remarkable performance across numerous tasks by using a diverse array of adaptation strategies. However, optimally selecting a model and adaptation strategy under resource constraints is challenging and often requires extensive experimentation. We investigate whether it is possible to accurately predict both performance and cost without expensive trials. We formalize the strategy selection problem for LLMs and introduce COSMOS, a unified prediction framework that efficiently estimates adaptation outcomes at minimal cost. We instantiate and study the capability of our framework via a pair of powerful predictors: embedding-augmented lightweight proxy models to predict fine-tuning performance, and low-sample scaling laws to forecast retrieval-augmented in-context learning. Extensive evaluation across eight representative benchmarks demonstrates that COSMOS achieves high prediction accuracy while reducing computational costs by 92.72% on average, and up to 98.71% in resource-intensive scenarios. Our results show that efficient prediction of adaptation outcomes is not only feasible but can substantially reduce the computational overhead of LLM deployment while maintaining performance standards.
