AutoMix: Automatically Mixing Language Models

Abstract

Large language models (LLMs) are now available in various sizes and configurations from cloud API providers. While this diversity offers a broad spectrum of choices, effectively leveraging these options to optimize computational cost and performance remains challenging. In this work, we present AutoMix, an approach that strategically routes queries to larger LMs based on the approximate correctness of outputs from a smaller LM. Central to AutoMix is a few-shot self-verification mechanism, which estimates the reliability of the smaller LM's own outputs without requiring training. Because these verifications can be noisy, AutoMix employs a meta verifier to refine the accuracy of the assessments. Our experiments using LLAMA2-13/70B on five context-grounded reasoning datasets demonstrate that AutoMix surpasses established baselines, improving the incremental benefit per cost by up to 89%. Our code and data are available at https://github.com/automix-llm/automix.
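
The routing idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: the names `route`, `RoutedAnswer`, `small_lm`, `large_lm`, and `verify`, as well as the sample count and threshold values, are hypothetical. The sketch assumes the verifier is called several times on the small model's draft and that a fixed confidence threshold stands in for the meta verifier.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class RoutedAnswer:
    answer: str
    model: str         # "small" or "large" (labels chosen for this sketch)
    confidence: float  # fraction of verification samples that accepted the draft


def route(
    query: str,
    small_lm: Callable[[str], str],      # hypothetical: returns the small model's answer
    large_lm: Callable[[str], str],      # hypothetical: returns the large model's answer
    verify: Callable[[str, str], bool],  # hypothetical: one few-shot self-verification sample
    num_samples: int = 8,                # assumed value, not taken from the paper
    threshold: float = 0.5,              # assumed value; simple stand-in for a meta verifier
) -> RoutedAnswer:
    # Step 1: always try the cheaper model first.
    draft = small_lm(query)

    # Step 2: sample the (possibly noisy) verifier several times and
    # average the votes into a confidence estimate.
    votes = [verify(query, draft) for _ in range(num_samples)]
    confidence = sum(votes) / num_samples

    # Step 3: keep the draft if it looks reliable; otherwise pay for the larger model.
    if confidence >= threshold:
        return RoutedAnswer(draft, "small", confidence)
    return RoutedAnswer(large_lm(query), "large", confidence)
```

In practice `small_lm`, `large_lm`, and `verify` would wrap calls to the actual models. Averaging repeated verifier samples is one simple way to dampen verification noise; a fixed threshold is only the most basic decision rule one could place on top of that estimate.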
