System-Level Natural Language Feedback

Abstract

Natural language (NL) feedback contains rich information about the user experience. Existing studies focus on an instance-level approach, where feedback is used to refine specific examples, and disregard its system-wide application. This paper proposes a general framework for unlocking the system-level use of NL feedback. We show how to use feedback to formalize system-level design decisions in a human-in-the-loop process in order to produce better models. In particular, this is done through (i) metric design for tasks and (ii) language model prompt design for refining model responses. We conduct two case studies of this approach, improving search query generation and dialog response generation, and demonstrate the effectiveness of system-level feedback. We show that combining system-level feedback with instance-level feedback brings further gains, and that human-written instance-level feedback results in more grounded refinements than GPT-3.5-written feedback, underscoring the importance of human feedback for building systems.
