Data Advisor: Dynamic Data Curation for Safety Alignment of Large Language Models

Abstract

Data is a crucial element in large language model (LLM) alignment. Recent studies have explored using LLMs for efficient data collection. However, LLM-generated data often suffers from quality issues: some desired aspects are underrepresented or missing entirely, and individual datapoints can be of low quality. To address these problems, we propose Data Advisor, an enhanced LLM-based data generation method that takes the characteristics of the desired dataset into account. Starting from a set of predefined principles, Data Advisor monitors the status of the generated data, identifies weaknesses in the current dataset, and advises the next iteration of data generation accordingly. Data Advisor can be easily integrated into existing data generation methods to enhance data quality and coverage. Experiments on the safety alignment of three representative LLMs (Mistral, Llama2, and Falcon) demonstrate that Data Advisor enhances model safety against various fine-grained safety issues without sacrificing model utility.
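
The monitor, identify, and advise loop described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the category list, the function names (`summarize`, `identify_weakness`, `generate`, `curate`), and the count-based notion of a "weakness" are all assumptions made for illustration, and the LLM calls are replaced by a placeholder that returns a dummy prompt.

```python
from collections import Counter

# Hypothetical safety categories standing in for the predefined principles.
CATEGORIES = ["violence", "privacy", "self-harm", "fraud"]


def summarize(dataset):
    """Monitor: count how many datapoints each category has so far."""
    counts = Counter({c: 0 for c in CATEGORIES})
    counts.update(item["category"] for item in dataset)
    return counts


def identify_weakness(summary):
    """Identify: pick the least-represented category (ties go to list order)."""
    return min(CATEGORIES, key=lambda c: summary[c])


def generate(category, index):
    """Advise and generate: placeholder for an LLM call steered by the advice."""
    return {"category": category, "prompt": f"[{category}] placeholder prompt #{index}"}


def curate(seed, iterations):
    """Run the loop: each new datapoint targets the current weakest category."""
    dataset = list(seed)
    for i in range(iterations):
        summary = summarize(dataset)
        weakness = identify_weakness(summary)
        dataset.append(generate(weakness, i))
    return dataset


# A seed set skewed toward one category; the loop fills in the others.
seed = [{"category": "violence", "prompt": "seed prompt"}] * 3
data = curate(seed, iterations=9)
print(summarize(data))
```

In this toy version the dataset summary is a simple per-category count; in the method described above, the summary, the weakness, and the advice would presumably each be produced by an LLM rather than by counting.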