Data Advisor: Dynamic Data Curation for Safety Alignment of Large Language Models

Abstract

Data is a crucial element in large language model (LLM) alignment. Recent studies have explored using LLMs for efficient data collection. However, LLM-generated data often suffers from quality issues: some aspects are underrepresented or absent, and some data points are of low quality. To address these problems, we propose Data Advisor, an enhanced LLM-based data generation method that takes into account the characteristics of the desired dataset. Starting from a set of predefined principles, Data Advisor monitors the status of the generated data, identifies weaknesses in the current dataset, and advises the next iteration of data generation accordingly. Data Advisor can be easily integrated into existing data generation methods to enhance data quality and coverage. Experiments on safety alignment of three representative LLMs (i.e., Mistral, Llama2, and Falcon) demonstrate the effectiveness of Data Advisor in enhancing model safety against various fine-grained safety issues without sacrificing model utility.
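The monitor, identify, and advise cycle described in the abstract can be illustrated with a toy loop. This is a minimal sketch under stated assumptions, not the authors' implementation: the category list, the `generate` callable (a stand-in for an LLM call), and the count-based weakness check are all hypothetical, and the actual method may monitor and advise in a different way.

```python
from collections import Counter
from typing import Callable, Dict, List

# Hypothetical safety categories; the abstract does not list any.
CATEGORIES = ["privacy", "self-harm", "fraud"]


def summarize(dataset: List[Dict[str, str]]) -> Counter:
    """Monitor: count how many data points cover each category."""
    counts = Counter({c: 0 for c in CATEGORIES})
    counts.update(item["category"] for item in dataset)
    return counts


def identify_weakness(counts: Counter) -> str:
    """Identify: pick the least represented category (ties go to list order)."""
    return min(CATEGORIES, key=lambda c: counts[c])


def curate(generate: Callable[[str], str], rounds: int) -> List[Dict[str, str]]:
    """Advise: steer each generation round toward the current weakness."""
    dataset: List[Dict[str, str]] = []
    for _ in range(rounds):
        target = identify_weakness(summarize(dataset))
        prompt = f"Write a safety-related user query about {target}."
        dataset.append({"category": target, "text": generate(prompt)})
    return dataset


def fake_generate(prompt: str) -> str:
    """Placeholder for an LLM call (assumption for illustration only)."""
    return f"[generated for: {prompt}]"


data = curate(fake_generate, rounds=6)
coverage = summarize(data)
```

Because each round targets the least covered category, the toy dataset stays balanced across categories. Replacing `fake_generate` with a real model call, and the counter with a richer summary of the data, would be the natural next step.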
