Self-Taught Self-Correction for Small Language Models

Abstract

Although large language models (LLMs) have achieved remarkable performance across various tasks, they remain prone to errors. A key challenge is enabling them to self-correct. While prior research has relied on external tools or large proprietary models, this work explores self-correction in small language models (SLMs) through iterative fine-tuning using solely self-generated data. We introduce the Self-Taught Self-Correction (STaSC) algorithm, which incorporates multiple algorithmic design choices. Experimental results on a question-answering task demonstrate that STaSC effectively learns self-correction, leading to significant performance improvements. Our analysis further provides insights into the mechanisms of self-correction and the impact of different design choices on learning dynamics and overall performance. To support future research, we release our user-friendly codebase and lightweight models.
