beeFormer: Bridging the Gap Between Semantic and Interaction Similarity in Recommender Systems

September 16, 2024
Authors: Vojtěch Vančura, Pavel Kordík, Milan Straka
cs.AI

Abstract

Recommender systems often use text-side information to improve their predictions, especially in cold-start or zero-shot recommendation scenarios, where traditional collaborative filtering approaches cannot be used. Many approaches to text-mining side information for recommender systems have been proposed over recent years, with sentence Transformers being the most prominent one. However, these models are trained to predict semantic similarity without utilizing interaction data with hidden patterns specific to recommender systems. In this paper, we propose beeFormer, a framework for training sentence Transformer models with interaction data. We demonstrate that our models trained with beeFormer can transfer knowledge between datasets while outperforming not only semantic similarity sentence Transformers but also traditional collaborative filtering methods. We also show that training on multiple datasets from different domains accumulates knowledge in a single model, unlocking the possibility of training universal, domain-agnostic sentence Transformer models to mine text representations for recommender systems. We release the source code, trained models, and additional details allowing replication of our experiments at https://github.com/recombee/beeformer.
