COSMOS: Predictable and Cost-Effective Adaptation of LLMs
April 30, 2025
Authors: Jiayu Wang, Aws Albarghouthi, Frederic Sala
cs.AI
Abstract
Large language models (LLMs) achieve remarkable performance across numerous
tasks by using a diverse array of adaptation strategies. However, optimally
selecting a model and adaptation strategy under resource constraints is
challenging and often requires extensive experimentation. We investigate
whether it is possible to accurately predict both performance and cost without
expensive trials. We formalize the strategy selection problem for LLMs and
introduce COSMOS, a unified prediction framework that efficiently estimates
adaptation outcomes at minimal cost. We instantiate and study the capability of
our framework via a pair of powerful predictors: embedding-augmented
lightweight proxy models to predict fine-tuning performance, and low-sample
scaling laws to forecast retrieval-augmented in-context learning. Extensive
evaluation across eight representative benchmarks demonstrates that COSMOS
achieves high prediction accuracy while reducing computational costs by 92.72%
on average, and up to 98.71% in resource-intensive scenarios. Our results show
that efficient prediction of adaptation outcomes is not only feasible but can
substantially reduce the computational overhead of LLM deployment while
maintaining performance standards.
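The first predictor the abstract names pairs frozen LLM embeddings with a lightweight proxy model to estimate fine-tuning performance without running the fine-tuning itself. The sketch below is a minimal illustration of that general idea, not the authors' implementation: it assumes embeddings are precomputed (here simulated with random vectors), uses a logistic-regression head as the proxy, and treats the proxy's held-out accuracy as a cheap signal for ranking candidate adaptations. All names and the choice of regressor are assumptions.

```python
# Illustrative sketch only: a lightweight proxy over frozen LLM embeddings,
# used to estimate adaptation outcomes without a full fine-tuning run.
# The regressor choice, data, and names are assumptions, not the paper's code.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Stand-ins for precomputed, frozen-LLM embeddings of labeled task examples.
X_emb = rng.normal(size=(1000, 768))  # one 768-d embedding per example
y = (X_emb[:, 0] + 0.1 * rng.normal(size=1000) > 0).astype(int)  # toy labels

X_tr, X_te, y_tr, y_te = train_test_split(
    X_emb, y, test_size=0.2, random_state=0
)

# The "proxy": a linear head trained on embeddings in seconds, standing in
# for the expensive fine-tuning run whose performance we want to predict.
proxy = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
proxy_score = proxy.score(X_te, y_te)
print(f"proxy accuracy (cheap estimate of fine-tuning outcome): {proxy_score:.3f}")
```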
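The second predictor forecasts retrieval-augmented in-context learning from a handful of cheap measurements via a scaling law. As a rough sketch of how such low-sample extrapolation can work, the code below fits a saturating power law, acc(n) ≈ a − b·n^(−c), to accuracies measured at small numbers of in-context examples n, then extrapolates to larger n. The functional form and the data points are assumptions chosen for demonstration; the paper's exact law may differ.

```python
# Illustrative sketch: fit a saturating power law to a few cheap measurements
# (accuracy at small numbers of in-context examples), then extrapolate.
# The functional form and the data points are assumptions for demonstration.
import numpy as np
from scipy.optimize import curve_fit

def scaling_law(n, a, b, c):
    """Saturating power law: accuracy approaches `a` as n grows."""
    return a - b * np.power(n, -c)

# Hypothetical low-sample measurements: accuracy at 1, 2, 4, 8 ICL examples.
n_obs = np.array([1.0, 2.0, 4.0, 8.0])
acc_obs = np.array([0.52, 0.61, 0.68, 0.72])

# Fit the three parameters from the four observed points.
params, _ = curve_fit(
    scaling_law, n_obs, acc_obs,
    p0=[0.8, 0.3, 0.5],
    bounds=([0.0, 0.0, 0.0], [1.0, 1.0, 5.0]),
)

# Forecast performance at larger, untested numbers of examples.
for n in (16, 32, 64):
    print(f"predicted accuracy at n={n}: {scaling_law(n, *params):.3f}")
```

In this style of predictor, the cost saving comes from only ever evaluating the small-n points; the large-n regime, which would dominate inference cost, is read off the fitted curve instead.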