Reformatted Alignment

Abstract

The quality of finetuning data is crucial for aligning large language models (LLMs) with human values. Current methods for improving data quality are either labor-intensive or prone to factual errors caused by LLM hallucinations. This paper explores how to raise the quality of existing instruction data so that it aligns better with human values, and introduces a simple and effective approach named ReAlign, which reformats the responses in instruction data into a format that better matches pre-established criteria and the collated evidence. The approach minimizes human annotation, hallucination, and the difficulty of scaling, and it remains orthogonal to existing alignment techniques. Experimentally, ReAlign significantly boosts the general alignment ability, math reasoning, factuality, and readability of LLMs. Encouragingly, without introducing any additional data or advanced training techniques, and merely by reformatting the responses, the accuracy of LLaMA-2-13B on the GSM8K mathematical reasoning benchmark improves from 46.77% to 56.63%. Additionally, a mere 5% of ReAlign data yields a 67% boost in general alignment ability as measured on the Alpaca dataset. This work highlights the need for further research into the science and mechanistic interpretability of LLMs. The associated code and data are publicly available at https://github.com/GAIR-NLP/ReAlign to support future studies.
