COSMOS: Predictable and Cost-Effective Adaptation of LLMs

Abstract

Large language models (LLMs) achieve remarkable performance across numerous tasks by using a diverse array of adaptation strategies. However, optimally selecting a model and adaptation strategy under resource constraints is challenging and often requires extensive experimentation. We investigate whether it is possible to accurately predict both performance and cost without expensive trials. We formalize the strategy selection problem for LLMs and introduce COSMOS, a unified prediction framework that efficiently estimates adaptation outcomes at minimal cost. We instantiate the framework and study its capability with two predictors: embedding-augmented lightweight proxy models to predict fine-tuning performance, and low-sample scaling laws to forecast retrieval-augmented in-context learning. Extensive evaluation across eight representative benchmarks demonstrates that COSMOS achieves high prediction accuracy while reducing computational costs by 92.72% on average, and by up to 98.71% in resource-intensive scenarios. Our results show that efficient prediction of adaptation outcomes is not only feasible but can substantially reduce the computational overhead of LLM deployment while maintaining performance standards.
