COSMOS: Predictable and Cost-Effective Adaptation of LLMs

Abstract

Large language models (LLMs) achieve remarkable performance across numerous tasks through a diverse array of adaptation strategies. However, selecting the optimal model and adaptation strategy under resource constraints is challenging and often requires extensive experimentation. We investigate whether both performance and cost can be predicted accurately without expensive trials. We formalize the strategy selection problem for LLMs and introduce COSMOS, a unified prediction framework that efficiently estimates adaptation outcomes at minimal cost. We instantiate the framework and study its capability with a pair of powerful predictors: embedding-augmented lightweight proxy models that predict fine-tuning performance, and low-sample scaling laws that forecast retrieval-augmented in-context learning. Extensive evaluation across eight representative benchmarks demonstrates that COSMOS achieves high prediction accuracy while reducing computational costs by 92.72% on average, and by up to 98.71% in resource-intensive scenarios. Our results show that efficient prediction of adaptation outcomes is not only feasible but can also substantially reduce the computational overhead of LLM deployment while maintaining performance standards.
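As a rough illustration of the strategy selection problem described in the abstract, the sketch below picks the candidate with the highest predicted performance whose predicted cost fits a budget. It is a minimal sketch under stated assumptions, not the COSMOS formulation: the names `Candidate` and `select_strategy`, the cost unit, and the tie-breaking rule are all hypothetical, and the predicted values are assumed to come from some upstream predictor that is not shown here.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Candidate:
    """Hypothetical (model, adaptation strategy) pair with predicted outcomes."""
    model: str
    strategy: str
    predicted_accuracy: float  # assumed to come from an upstream predictor
    predicted_cost: float      # assumed unit, e.g. GPU-hours


def select_strategy(
    candidates: Sequence[Candidate], budget: float
) -> Optional[Candidate]:
    """Return the best feasible candidate, or None if nothing fits the budget.

    Feasible means predicted_cost <= budget. Among feasible candidates the
    highest predicted accuracy wins; ties go to the cheaper candidate
    (an assumption made for this sketch).
    """
    feasible = [c for c in candidates if c.predicted_cost <= budget]
    if not feasible:
        return None
    return max(feasible, key=lambda c: (c.predicted_accuracy, -c.predicted_cost))


if __name__ == "__main__":
    # Placeholder numbers for illustration only, not results from the paper.
    pool = [
        Candidate("model-a", "fine-tuning", 0.80, 12.0),
        Candidate("model-a", "in-context", 0.74, 3.0),
        Candidate("model-b", "in-context", 0.74, 2.0),
    ]
    print(select_strategy(pool, budget=5.0))
```

The selection step is deliberately kept separate from prediction: any predictor that yields an accuracy estimate and a cost estimate per candidate could feed this function.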
