Dual-View Training for Instruction-Following Information Retrieval

Abstract

Instruction-following information retrieval (IF-IR) studies retrieval systems that must not only find documents relevant to a query, but also obey explicit user constraints such as required attributes, exclusions, or output preferences. However, most retrievers are trained primarily for semantic relevance and often fail to distinguish documents that match the topic from those that satisfy the instruction. We propose a dual-view data synthesis strategy based on polarity reversal: given a query, an instruction, a document that is relevant under that instruction, and a hard negative that matches the query but violates the instruction, we prompt an LLM to generate a complementary instruction under which the two documents swap relevance labels. By presenting the same document pair under complementary instructions that invert their relevance labels, the training signal forces the retriever to reconsider the same candidate set through the instruction, rather than relying on fixed topical cues. On a 305M-parameter encoder, our method improves performance on the FollowIR benchmark by 45%, surpassing general-purpose embedding models of comparable or larger scale. Through head-to-head comparisons at matched data budgets, we further show that data diversity and instruction supervision play complementary roles: the former preserves general retrieval quality, while the latter improves instruction sensitivity. These results highlight the value of targeted data synthesis for building retrieval systems that are both broadly capable and instruction-aware.
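The polarity-reversal idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the `Example` record, the `dual_view` helper, and the sample texts are all hypothetical, and the complementary instruction is passed in by hand here, whereas the abstract says it is generated by prompting an LLM.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Example:
    """One training triple (hypothetical layout): an instructed query,
    a positive document, and a hard negative document."""
    query: str
    instruction: str
    positive: str
    negative: str


def dual_view(query, instruction, relevant_doc, hard_negative,
              complementary_instruction):
    """Build the two views of one document pair.

    Assumption: `complementary_instruction` stands in for the
    LLM-generated instruction under which the labels swap.
    """
    # View 1: the original instruction with the original labels.
    original = Example(query, instruction, relevant_doc, hard_negative)
    # View 2: same query and same documents, with the labels reversed
    # under the complementary instruction.
    reversed_view = Example(query, complementary_instruction,
                            hard_negative, relevant_doc)
    return [original, reversed_view]


# Illustrative data only; none of these strings come from the paper.
views = dual_view(
    query="treatments for migraine",
    instruction="Only include studies on adults.",
    relevant_doc="A trial of drug X in adult migraine patients.",
    hard_negative="A trial of drug X in children with migraine.",
    complementary_instruction="Only include studies on children.",
)
for view in views:
    print(view.instruction, "->", view.positive)
```

Because both views share the query and the document pair, topical similarity alone cannot separate positive from negative in both of them; only the instruction differs, which is the signal the abstract attributes to the dual-view setup.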
