Shepherd: A Critic for Language Model Generation

Abstract

As large language models improve, there is increasing interest in techniques that leverage these models' capabilities to refine their own outputs. In this work, we introduce Shepherd, a language model specifically tuned to critique responses and suggest refinements, extending beyond the capabilities of an untuned model to identify diverse errors and provide suggestions to remedy them. At the core of our approach is a high-quality feedback dataset, which we curate from community feedback and human annotations. Even though Shepherd is small (7B parameters), its critiques are either equivalent or preferred to those from established models including ChatGPT. Using GPT-4 for evaluation, Shepherd reaches an average win rate of 53-87% compared to competitive alternatives. In human evaluation, Shepherd strictly outperforms other models and on average closely ties with ChatGPT.
