Data Advisor: Dynamic Data Curation for Safety Alignment of Large Language Models

October 7, 2024
Authors: Fei Wang, Ninareh Mehrabi, Palash Goyal, Rahul Gupta, Kai-Wei Chang, Aram Galstyan
cs.AI

Abstract

Data is a crucial element in large language model (LLM) alignment. Recent studies have explored using LLMs for efficient data collection. However, LLM-generated data often suffers from quality issues, including underrepresented or missing aspects and low-quality datapoints. To address these problems, we propose Data Advisor, an enhanced LLM-based method for generating data that takes into account the characteristics of the desired dataset. Starting from a set of predefined principles, Data Advisor monitors the status of the generated data, identifies weaknesses in the current dataset, and advises the next iteration of data generation accordingly. Data Advisor can be easily integrated into existing data generation methods to enhance data quality and coverage. Experiments on safety alignment of three representative LLMs (i.e., Mistral, Llama2, and Falcon) demonstrate the effectiveness of Data Advisor in enhancing model safety against various fine-grained safety issues without sacrificing model utility.
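
The monitor, identify, advise loop described in the abstract can be illustrated with a small toy sketch. This is not the authors' implementation: the category list, the function names (`summarize`, `identify_weakness`, `toy_generator`, `run_data_advisor`), and the rule of picking the least-covered category are all assumptions made for illustration. In the method as described, these steps are LLM-based; here they are replaced with simple counting so the control flow is easy to follow.

```python
from collections import Counter
from typing import Callable, List, Tuple

# Hypothetical safety categories; the paper's actual principles are not listed here.
CATEGORIES = ["self-harm", "fraud", "privacy", "hate speech"]

Datapoint = Tuple[str, str]  # (category, prompt text)


def summarize(dataset: List[Datapoint]) -> Counter:
    """Monitor: count how many datapoints cover each category."""
    counts = Counter({c: 0 for c in CATEGORIES})
    counts.update(category for category, _ in dataset)
    return counts


def identify_weakness(summary: Counter) -> str:
    """Identify: pick the least represented category (ties resolved by list order)."""
    return min(CATEGORIES, key=lambda c: summary[c])


def toy_generator(advice: str, index: int) -> Datapoint:
    """Stand-in for an LLM call that writes a datapoint following the advice."""
    return (advice, f"example prompt #{index} about {advice}")


def run_data_advisor(
    seed: List[Datapoint],
    generate: Callable[[str, int], Datapoint],
    iterations: int,
) -> List[Datapoint]:
    """Run the loop: summarize the data, find a weakness, advise the next generation."""
    dataset = list(seed)
    for i in range(iterations):
        summary = summarize(dataset)          # monitor the current dataset
        advice = identify_weakness(summary)   # find what is underrepresented
        dataset.append(generate(advice, i))   # generate guided by the advice
    return dataset


# A seed set skewed toward "fraud"; the loop fills in the gaps.
seed = [("fraud", "seed prompt 1"), ("fraud", "seed prompt 2"), ("privacy", "seed prompt 3")]
final = run_data_advisor(seed, toy_generator, iterations=5)
coverage = summarize(final)
```

Because the generator is passed in as a callable, the same loop could wrap an existing data generation method, which mirrors the abstract's claim that Data Advisor integrates with existing pipelines.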
