Qilin: A Multimodal Information Retrieval Dataset with APP-level User Sessions
March 1, 2025
Authors: Jia Chen, Qian Dong, Haitao Li, Xiaohui He, Yan Gao, Shaosheng Cao, Yi Wu, Ping Yang, Chen Xu, Yao Hu, Qingyao Ai, Yiqun Liu
cs.AI
Abstract
User-generated content (UGC) communities, especially those featuring
multimodal content, improve user experiences by integrating visual and textual
information into results (or items). The challenge of improving user
experiences in complex systems with search and recommendation (S&R) services
has drawn significant attention from both academia and industry in recent years.
However, the lack of high-quality datasets has limited the research progress on
multimodal S&R. To address the growing need for developing better S&R
services, we present Qilin, a novel multimodal information retrieval dataset.
The dataset is collected from Xiaohongshu, a popular
social platform with over 300 million monthly active users and an average
search penetration rate of over 70%. In contrast to existing datasets,
Qilin offers a comprehensive collection of user sessions with
heterogeneous results like image-text notes, video notes, commercial notes, and
direct answers, facilitating the development of advanced multimodal neural
retrieval models across diverse task settings. To better model user
satisfaction and support the analysis of heterogeneous user behaviors, we also
collect extensive APP-level contextual signals and genuine user feedback.
Notably, Qilin contains user-favored answers and their referred results for
search requests triggering the Deep Query Answering (DQA) module. This allows
not only the training and evaluation of a Retrieval-Augmented Generation (RAG)
pipeline, but also the exploration of how such a module would affect users'
search behavior. Through comprehensive analysis and experiments, we provide
interesting findings and insights for further improving S&R systems. We hope
that Qilin will significantly contribute to the advancement of
multimodal content platforms with S&R services in the future.