Backtracing: Retrieving the Cause of the Query

Abstract

Many online content portals allow users to ask questions to supplement their understanding (e.g., of lectures). While information retrieval (IR) systems may provide answers to such user queries, they do not directly help content creators, such as lecturers who want to improve their content, identify the segments that _caused_ a user to ask those questions. We introduce the task of backtracing, in which systems retrieve the text segment that most likely caused a user query. We formalize three real-world domains in which backtracing is important for improving content delivery and communication: understanding the cause of (a) student confusion in the Lecture domain, (b) reader curiosity in the News Article domain, and (c) user emotion in the Conversation domain. We evaluate the zero-shot performance of popular information retrieval and language modeling methods, including bi-encoder, re-ranking, and likelihood-based methods, as well as ChatGPT. While traditional IR systems retrieve semantically relevant information (e.g., details on "projection matrices" for a query "does projecting multiple times still lead to the same point?"), they often miss the causally relevant context (e.g., the lecturer states "projecting twice gets me the same answer as one projection"). Our results show that there is room for improvement on backtracing and that the task requires new retrieval approaches. We hope our benchmark serves to improve future retrieval systems for backtracing, spawning systems that refine content generation and identify linguistic triggers influencing user queries. Our code and data are open-sourced: https://github.com/rosewang2008/backtracing.
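
To make the task definition concrete, the following is a minimal sketch of backtracing framed as ranking: score every segment of the source text against the query and return the highest-scoring one. The function names (`tokenize`, `overlap_score`, `backtrace`), the Jaccard token-overlap scorer, and the three-sentence toy lecture are all assumptions made for illustration. They are not the benchmark's code or any of the evaluated methods, and a real system would presumably plug a bi-encoder, re-ranker, or likelihood-based scorer in through the `score` argument.

```python
import re
from typing import Callable, Sequence, Set


def tokenize(text: str) -> Set[str]:
    """Lowercase a string and return its set of alphabetic word tokens."""
    return set(re.findall(r"[a-z]+", text.lower()))


def overlap_score(query: str, segment: str) -> float:
    """Jaccard overlap of token sets: a crude, hypothetical stand-in scorer."""
    q, s = tokenize(query), tokenize(segment)
    union = q | s
    return len(q & s) / len(union) if union else 0.0


def backtrace(
    query: str,
    segments: Sequence[str],
    score: Callable[[str, str], float] = overlap_score,
) -> int:
    """Return the index of the segment that scores highest against the query."""
    return max(range(len(segments)), key=lambda i: score(query, segments[i]))


# Toy data: the query and the middle sentence are quoted from the abstract;
# the other two sentences are invented filler for this illustration.
segments = [
    "Today we will talk about projection matrices.",
    "Projecting twice gets me the same answer as one projection.",
    "Next lecture covers least squares.",
]
query = "Does projecting multiple times still lead to the same point?"

cause_index = backtrace(query, segments)
print(cause_index, segments[cause_index])
```

On this hand-picked example the overlap scorer happens to select the causal sentence, but that should not be read as evidence about real data: the abstract reports that similarity-driven retrieval often misses the causally relevant segment. The scorer is kept as a parameter so that alternatives can be swapped in and compared under the same interface.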
