Shepherd: A Critic for Language Model Generation

Abstract

As large language models improve, there is increasing interest in techniques that leverage these models' capabilities to refine their own outputs. In this work, we introduce Shepherd, a language model specifically tuned to critique responses and suggest refinements. It extends beyond the capabilities of an untuned model by identifying diverse errors and suggesting how to remedy them. At the core of our approach is a high-quality feedback dataset, which we curate from community feedback and human annotations. Even though Shepherd is small (7B parameters), its critiques are judged equivalent or preferable to those from established models, including ChatGPT. Using GPT-4 for evaluation, Shepherd reaches an average win rate of 53-87% compared to competitive alternatives. In human evaluation, Shepherd strictly outperforms other models and, on average, closely ties with ChatGPT.
