

COSMOS: Predictable and Cost-Effective Adaptation of LLMs

April 30, 2025
作者: Jiayu Wang, Aws Albarghouthi, Frederic Sala
cs.AI

Abstract

Large language models (LLMs) achieve remarkable performance across numerous tasks by using a diverse array of adaptation strategies. However, optimally selecting a model and adaptation strategy under resource constraints is challenging and often requires extensive experimentation. We investigate whether it is possible to accurately predict both performance and cost without expensive trials. We formalize the strategy selection problem for LLMs and introduce COSMOS, a unified prediction framework that efficiently estimates adaptation outcomes at minimal cost. We instantiate and study the capability of our framework via a pair of powerful predictors: embedding-augmented lightweight proxy models to predict fine-tuning performance, and low-sample scaling laws to forecast retrieval-augmented in-context learning. Extensive evaluation across eight representative benchmarks demonstrates that COSMOS achieves high prediction accuracy while reducing computational costs by 92.72% on average, and up to 98.71% in resource-intensive scenarios. Our results show that efficient prediction of adaptation outcomes is not only feasible but can substantially reduce the computational overhead of LLM deployment while maintaining performance standards.
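To make the second predictor concrete, here is a minimal sketch of the "low-sample scaling law" idea: measure in-context learning accuracy at a few small shot counts, fit a parametric curve, and extrapolate to larger shot counts without running the expensive trials. The saturating power-law form, the sample data, and the fitting choices below are illustrative assumptions, not the paper's actual implementation.

```python
# Minimal sketch of forecasting in-context learning performance from a
# handful of cheap measurements. The functional form and data are
# hypothetical; COSMOS's actual predictor may differ.
import numpy as np
from scipy.optimize import curve_fit

def scaling_law(k, a, b, c):
    # Saturating power law: accuracy approaches the ceiling `a` as the
    # number of in-context examples k grows; `b` and `c` set the rate.
    return a - b * np.power(k, -c)

# Hypothetical low-sample measurements: accuracy at k = 1, 2, 4, 8 shots.
k_obs = np.array([1.0, 2.0, 4.0, 8.0])
acc_obs = np.array([0.52, 0.58, 0.63, 0.66])

# Fit the three parameters to the observed points.
params, _ = curve_fit(scaling_law, k_obs, acc_obs, p0=[0.7, 0.2, 0.5])

# Forecast performance at a larger shot count without running the trial.
k_target = 32
print(f"Predicted accuracy at k={k_target}: {scaling_law(k_target, *params):.3f}")
```

The cost saving comes from the asymmetry: evaluating at k = 1 to 8 shots is far cheaper than sweeping large shot counts (or fine-tuning), yet the fitted curve lets the framework compare strategies before committing compute to any of them.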