AutoMix: Automatically Mixing Language Models
October 19, 2023
Authors: Aman Madaan, Pranjal Aggarwal, Ankit Anand, Srividya Pranavi Potharaju, Swaroop Mishra, Pei Zhou, Aditya Gupta, Dheeraj Rajagopal, Karthik Kappaganthu, Yiming Yang, Shyam Upadhyay, Mausam, Manaal Faruqui
cs.AI
Abstract
Large language models (LLMs) are now available in various sizes and
configurations from cloud API providers. While this diversity offers a broad
spectrum of choices, effectively leveraging the options to optimize
computational cost and performance remains challenging. In this work, we
present AutoMix, an approach that strategically routes queries to larger LMs,
based on the approximate correctness of outputs from a smaller LM. Central to
AutoMix is a few-shot self-verification mechanism, which estimates the
reliability of its own outputs without requiring training. Given that
verifications can be noisy, we employ a meta verifier in AutoMix to refine the
accuracy of these assessments. Our experiments using LLAMA2-13/70B, on five
context-grounded reasoning datasets demonstrate that AutoMix surpasses
established baselines, improving the incremental benefit per cost by up to 89%.
Our code and data are available at https://github.com/automix-llm/automix.
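The abstract describes AutoMix's core loop: answer with the smaller LM, have it self-verify its answer via few-shot prompting, refine that noisy signal with a meta-verifier, and escalate to the larger LM only when confidence is low. The sketch below illustrates that control flow only; the model stubs, the toy verification heuristic, and the `CONF_THRESHOLD` cutoff are all hypothetical stand-ins, not the paper's actual prompts or meta-verifier.

```python
# Minimal sketch of the AutoMix-style routing loop described in the
# abstract. All functions below are illustrative placeholders.

CONF_THRESHOLD = 0.7  # hypothetical escalation cutoff


def small_lm(query: str) -> str:
    """Stand-in for the smaller model (e.g. LLAMA2-13B)."""
    return f"small-answer({query})"


def large_lm(query: str) -> str:
    """Stand-in for the larger model (e.g. LLAMA2-70B)."""
    return f"large-answer({query})"


def self_verify(query: str, answer: str) -> float:
    """Toy stand-in for few-shot self-verification: in AutoMix the
    smaller LM is prompted to estimate whether its own answer is
    correct. Here we return a fixed deterministic score instead."""
    return 0.9 if len(answer) % 2 == 0 else 0.3


def meta_verify(score: float) -> float:
    """Stand-in for the meta-verifier that refines the noisy
    self-verification signal (identity here, for illustration)."""
    return score


def automix_route(query: str) -> str:
    """Answer with the small model; escalate to the large model when
    the (meta-)verified confidence falls below the threshold."""
    answer = small_lm(query)
    confidence = meta_verify(self_verify(query, answer))
    if confidence >= CONF_THRESHOLD:
        return answer          # accept the cheap answer
    return large_lm(query)     # escalate: pay for the larger model
```

The cost savings come from the asymmetry this loop exploits: most queries are resolved by the cheaper model, and the larger model is invoked only on the subset the verifier flags as unreliable.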