FollowIR: Evaluating and Teaching Information Retrieval Models to Follow Instructions

March 22, 2024
Authors: Orion Weller, Benjamin Chang, Sean MacAvaney, Kyle Lo, Arman Cohan, Benjamin Van Durme, Dawn Lawrie, Luca Soldaini
cs.AI

Abstract

Modern Large Language Models (LLMs) can follow long and complex instructions, which lets them support a diverse range of user tasks. However, although Information Retrieval (IR) models use LLMs as the backbone of their architectures, nearly all of them still take only queries as input, with no instructions. For the handful of recent models that do take instructions, it is unclear how they use them. We introduce our dataset FollowIR, which contains a rigorous instruction evaluation benchmark as well as a training set for helping IR models learn to better follow real-world instructions. FollowIR builds on the long history of the TREC conferences: just as TREC provides human annotators with instructions (also known as narratives) to determine document relevance, IR models should be able to understand and decide relevance based on these detailed instructions. Our evaluation benchmark starts with three deeply judged TREC collections and alters the annotator instructions, re-annotating relevant documents. This process lets us measure how well IR models follow instructions, using a new pairwise evaluation framework. Our results indicate that existing retrieval models fail to use instructions correctly: they use them only for basic keywords and struggle to understand long-form information. However, we show that IR models can learn to follow complex instructions: our new FollowIR-7B model improves significantly (by over 13%) after fine-tuning on our training set.
