Data Advisor: Dynamic Data Curation for Safety Alignment of Large Language Models
October 7, 2024
Authors: Fei Wang, Ninareh Mehrabi, Palash Goyal, Rahul Gupta, Kai-Wei Chang, Aram Galstyan
cs.AI
Abstract
Data is a crucial element in large language model (LLM) alignment. Recent
studies have explored using LLMs for efficient data collection. However,
LLM-generated data often suffers from quality issues, with underrepresented or
absent aspects and low-quality datapoints. To address these problems, we
propose Data Advisor, an enhanced LLM-based method for generating data that
takes into account the characteristics of the desired dataset. Starting from a
set of predefined principles, Data Advisor monitors the status of the
generated data, identifies weaknesses in the current dataset, and advises the
next iteration of data generation accordingly. Data Advisor can be easily
integrated into existing data generation methods to enhance data quality and
coverage. Experiments on safety alignment of three representative LLMs (i.e.,
Mistral, Llama2, and Falcon) demonstrate the effectiveness of Data Advisor in
enhancing model safety against various fine-grained safety issues without
sacrificing model utility.
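The monitor, identify, and advise loop described in the abstract can be illustrated with a minimal Python sketch. This is a toy stand-in under stated assumptions, not the authors' implementation: the abstract describes an LLM-based method, whereas this sketch replaces each step with simple counting. The category list, the function names (`summarize`, `identify_weakness`, `curate`, `toy_generate`), and the datapoint format are all hypothetical.

```python
from collections import Counter
from typing import Callable, Dict, List

# Hypothetical safety categories standing in for the "predefined principles";
# the abstract does not enumerate them.
CATEGORIES = ["violence", "privacy", "self-harm"]

Datapoint = Dict[str, str]


def summarize(dataset: List[Datapoint]) -> Counter:
    """Monitor: count how many datapoints cover each category."""
    counts = Counter({category: 0 for category in CATEGORIES})
    counts.update(item["category"] for item in dataset)
    return counts


def identify_weakness(summary: Counter) -> str:
    """Identify: pick the least-represented category (ties go to list order)."""
    return min(CATEGORIES, key=lambda category: summary[category])


def curate(generate: Callable[[str], Datapoint], rounds: int) -> List[Datapoint]:
    """Advise: steer each generation round toward the current weakness."""
    dataset: List[Datapoint] = []
    for _ in range(rounds):
        summary = summarize(dataset)
        target = identify_weakness(summary)
        dataset.append(generate(target))
    return dataset


def toy_generate(category: str) -> Datapoint:
    """Placeholder generator; a real pipeline would call a data-generating LLM."""
    return {"category": category, "prompt": f"placeholder prompt about {category}"}


if __name__ == "__main__":
    data = curate(toy_generate, rounds=6)
    print(summarize(data))
```

Because each round targets the least-covered category, the toy dataset ends up evenly spread across the hypothetical categories. In a real pipeline, the summary and the weakness would more plausibly be free-text judgments than counts.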