Atla Selene Mini: A General Purpose Evaluation Model

January 27, 2025
Authors: Andrei Alexandru, Antonia Calvi, Henry Broomfield, Jackson Golden, Kyle Dai, Mathias Leys, Maurice Burger, Max Bartolo, Roman Engeler, Sashank Pisupati, Toby Drane, Young Sun Park
cs.AI

Abstract

We introduce Atla Selene Mini, a state-of-the-art small language model-as-a-judge (SLMJ). Selene Mini is a general-purpose evaluator that outperforms the best SLMJs and GPT-4o-mini on overall performance across 11 out-of-distribution benchmarks, spanning absolute scoring, classification, and pairwise preference tasks. It is the highest-scoring 8B generative model on RewardBench, surpassing strong baselines like GPT-4o and specialized judges. To achieve this, we develop a principled data curation strategy that augments public datasets with synthetically generated critiques and ensures high quality through filtering and dataset ablations. We train our model on a combined direct preference optimization (DPO) and supervised fine-tuning (SFT) loss, and produce a highly promptable evaluator that excels in real-world scenarios. Selene Mini shows dramatically improved zero-shot agreement with human expert evaluations on financial and medical industry datasets. It is also robust to variations in prompt format. Preliminary results indicate that Selene Mini is the top-ranking evaluator in a live, community-driven Judge Arena. We release the model weights on HuggingFace (https://hf.co/AtlaAI/Selene-1-Mini-Llama-3.1-8B) and Ollama to encourage widespread community adoption.
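
The abstract does not spell out the exact form of the combined objective. As a rough sketch only, one common way to combine the two terms is to add a weighted negative log-likelihood on the preferred response to the standard DPO loss. The function names, the weight `alpha`, and the temperature `beta` below are illustrative assumptions, not values or code from the paper.

```python
import math


def log_sigmoid(x: float) -> float:
    """Numerically stable log(sigmoid(x))."""
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))


def combined_dpo_sft_loss(
    policy_chosen_logp: float,
    policy_rejected_logp: float,
    ref_chosen_logp: float,
    ref_rejected_logp: float,
    beta: float = 0.1,   # assumed DPO temperature (hypothetical value)
    alpha: float = 1.0,  # assumed weight on the SFT term (hypothetical value)
) -> float:
    """Per-example loss: standard DPO term plus alpha times an SFT (NLL) term.

    Each *_logp argument is the summed log-probability of a full response
    (chosen or rejected) under the policy or the frozen reference model.
    """
    # DPO: how much more the policy prefers chosen over rejected than the reference does.
    margin = (policy_chosen_logp - ref_chosen_logp) - (
        policy_rejected_logp - ref_rejected_logp
    )
    dpo_term = -log_sigmoid(beta * margin)
    # SFT: negative log-likelihood of the chosen response under the policy.
    sft_term = -policy_chosen_logp
    return dpo_term + alpha * sft_term
```

In this sketch the SFT term keeps the likelihood of the preferred judgment from drifting down while the DPO term widens the gap between preferred and rejected judgments; whether the paper normalizes by response length or uses a different weighting is not stated in the abstract.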
