Fine-Tuning on Noisy Instructions: Effects on Generalization and Performance

Abstract

Instruction-tuning plays a vital role in enhancing the task-solving abilities of large language models (LLMs) and in making them more useful at generating helpful responses across a variety of tasks. However, previous work has demonstrated that these models are sensitive to minor variations in instruction phrasing. In this paper, we explore whether introducing perturbations into instruction-tuning data can make LLMs more robust to noisy instructions. We focus on how instruction-tuning with perturbations, such as removing stop words or shuffling words, affects LLMs' performance on the original and perturbed versions of widely used benchmarks (MMLU, BBH, GSM8K). We further assess learning dynamics and potential shifts in model behavior. Surprisingly, our results suggest that instruction-tuning on perturbed instructions can, in some cases, improve downstream performance. These findings highlight the importance of including perturbed instructions in instruction-tuning, which can make LLMs more resilient to noisy user inputs.
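The two perturbations named in the abstract, stop-word removal and word shuffling, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the tiny `STOP_WORDS` set, the whitespace tokenization, and the `seed` parameter are all assumptions made for illustration, and the source does not say which stop-word list or tokenizer was used.

```python
import random

# Hypothetical, deliberately small stop-word list for illustration only;
# the source does not specify which list was used.
STOP_WORDS = frozenset(
    {"a", "an", "the", "of", "to", "in", "on", "and", "or", "is", "are", "for", "with"}
)


def remove_stop_words(instruction: str) -> str:
    """Drop whitespace-separated tokens that are stop words.

    Tokens are compared after lowercasing and stripping surrounding
    punctuation; kept tokens are emitted unchanged.
    """
    kept = [
        word
        for word in instruction.split()
        if word.lower().strip(".,;:!?") not in STOP_WORDS
    ]
    return " ".join(kept)


def shuffle_words(instruction: str, seed: int = 0) -> str:
    """Return the instruction with its whitespace-separated tokens in random order.

    A local random.Random instance keeps the shuffle reproducible for a
    given seed without touching the global random state.
    """
    words = instruction.split()
    random.Random(seed).shuffle(words)
    return " ".join(words)


if __name__ == "__main__":
    instruction = "Translate the following sentence to French."
    print(remove_stop_words(instruction))  # Translate following sentence French.
    print(shuffle_words(instruction))      # same tokens, permuted order
```

In a setup like the one the abstract describes, functions of this kind would presumably be applied to the instruction field of some fraction of the training examples while leaving the target responses unchanged; that fraction and the exact pipeline are not given in this section.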
