Exploring Format Consistency for Instruction Tuning

Abstract

Instruction tuning has emerged as a promising approach to improving how well large language models follow human instructions. Increasing the diversity and number of instructions in the training data has been shown to consistently improve generalization performance, which has motivated recent efforts to collect varied instructions and to integrate existing instruction tuning datasets into larger collections. However, different users express instructions in their own ways, and instruction styles and formats often vary across datasets, a problem we refer to as format inconsistency. In this work, we study how format inconsistency may affect the performance of instruction tuning. We propose a framework called "Unified Instruction Tuning" (UIT), which calls OpenAI APIs for automatic format transfer among different instruction tuning datasets. We show that UIT improves generalization performance on unseen instructions, which highlights the importance of format consistency for instruction tuning. To make the UIT framework more practical, we further propose a novel perplexity-based denoising method to reduce the noise of automatic format transfer. We also train a smaller offline model that achieves format transfer capability comparable to that of the OpenAI APIs, reducing costs in practice.
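The abstract does not say how perplexity is used for denoising. One plausible reading, offered here only as an assumption, is that several converted candidates are scored by a language model and the one with the lowest perplexity is kept. The sketch below illustrates that idea; the function names and the per-token log-probabilities are hypothetical and do not come from the paper.

```python
import math
from typing import Mapping, Sequence


def perplexity(token_logprobs: Sequence[float]) -> float:
    """Perplexity as exp of the negative mean natural-log probability per token."""
    if not token_logprobs:
        raise ValueError("need at least one token log-probability")
    return math.exp(-sum(token_logprobs) / len(token_logprobs))


def pick_lowest_perplexity(candidates: Mapping[str, Sequence[float]]) -> str:
    """Return the candidate text whose token log-probs give the lowest perplexity.

    Assumption: `candidates` maps each converted instruction to the per-token
    log-probabilities assigned to it by some scoring language model.
    """
    return min(candidates, key=lambda text: perplexity(candidates[text]))


# Hypothetical scores for two format-transfer outputs of the same instruction.
candidates = {
    "Definition: Translate the sentence into French.": [-0.2, -0.1, -0.3, -0.2],
    "translate french sentence definition": [-1.5, -2.0, -1.0, -2.5],
}
best = pick_lowest_perplexity(candidates)
```

In practice the log-probabilities would come from whichever model is used for scoring. The selection rule stays the same: a lower perplexity means the scoring model finds the converted text more natural.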
