CritiQ: Mining Data Quality Criteria from Human Preferences

Abstract

Language models depend heavily on high-quality data for optimal performance. Existing approaches rely on manually designed heuristics, the perplexity of existing models, trained classifiers, or careful prompt engineering, all of which require significant expert experience and human annotation effort while introducing biases. We introduce CritiQ, a novel data selection method that automatically mines data quality criteria from human preferences using only about 30 human-annotated pairs and performs efficient data selection. The main component, CritiQ Flow, employs a manager agent to evolve quality criteria and worker agents to make pairwise judgments. We build a knowledge base that extracts quality criteria from previous work to boost CritiQ Flow. Compared to perplexity-based and classifier-based methods, verbal criteria are more interpretable and can be reused. After deriving the criteria, we train the CritiQ Scorer to assign quality scores and perform efficient data selection. We demonstrate the effectiveness of our method in the code, math, and logic domains, achieving high accuracy on human-annotated test sets. To validate the quality of the selected data, we continually train Llama 3.1 models and observe improved performance on downstream tasks compared to uniform sampling. Ablation studies validate the benefits of the knowledge base and the reflection process. We analyze how the criteria evolve and how effective majority voting is.
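To make the pairwise-judgment and majority-voting idea concrete, here is a minimal sketch. It is not the authors' implementation: the abstract does not say how worker judgments are aggregated beyond "majority voting", so the one-vote-per-criterion layout, the tie handling, the function names, and the toy data are all assumptions made for illustration.

```python
from collections import Counter


def majority_vote(votes):
    """Return 'A' or 'B', whichever more judgments prefer; None on a tie.

    Assumption: each vote is one worker judgment for the same text pair.
    """
    counts = Counter(votes)
    a, b = counts.get("A", 0), counts.get("B", 0)
    if a == b:
        return None  # assumed tie handling: abstain
    return "A" if a > b else "B"


def accuracy(predictions, human_labels):
    """Fraction of pairs where the voted preference matches the human label."""
    matches = sum(p == h for p, h in zip(predictions, human_labels))
    return matches / len(human_labels)


# Hypothetical judgments: one list per pair, one vote per quality criterion.
worker_votes = [
    ["A", "A", "B"],
    ["B", "B", "B"],
    ["A", "B", "B"],
    ["A", "B"],
]
# Hypothetical human-annotated preferences for the same four pairs.
human_labels = ["A", "B", "A", "B"]

predictions = [majority_vote(v) for v in worker_votes]
print(predictions)
print(accuracy(predictions, human_labels))
```

In this sketch a tie counts as a miss when computing accuracy; a real pipeline might instead break ties by criterion weight or discard the pair.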
