Extracting Interaction-Aware Monosemantic Concepts in Recommender Systems

November 22, 2025
Authors: Dor Arviv, Yehonatan Elisha, Oren Barkan, Noam Koenigstein
cs.AI

Abstract

We present a method for extracting monosemantic neurons, defined as latent dimensions that align with coherent and interpretable concepts, from user and item embeddings in recommender systems. Our approach employs a Sparse Autoencoder (SAE) to reveal semantic structure within pretrained representations. In contrast to work on language models, monosemanticity in recommendation must preserve the interactions between separate user and item embeddings. To achieve this, we introduce a prediction-aware training objective that backpropagates through a frozen recommender and aligns the learned latent structure with the model's user-item affinity predictions. The resulting neurons capture properties such as genre, popularity, and temporal trends, and support post hoc control operations, including targeted filtering and content promotion, without modifying the base model. Our method generalizes across different recommendation models and datasets, providing a practical tool for interpretable and controllable personalization. Code and evaluation resources are available at https://github.com/DeltaLabTLV/Monosemanticity4Rec.
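
To make the prediction-aware objective concrete, the following is a minimal sketch of how such a loss might be assembled. The abstract does not specify the architecture or loss terms, so everything here is an assumption for illustration: a single ReLU SAE shared by users and items, a dot-product scorer standing in for the frozen recommender, and the hypothetical names `encode`, `decode`, `frozen_score`, `prediction_aware_loss`, `suppress_neuron`, `alpha`, and `beta`. It is not taken from the authors' repository.

```python
import numpy as np

rng = np.random.default_rng(0)
d, k = 8, 32  # embedding size and SAE width (illustrative values)

# Hypothetical SAE parameters. In training, these would be the only
# weights updated; the recommender itself stays frozen.
W_enc = rng.normal(scale=0.1, size=(d, k))
b_enc = np.zeros(k)
W_dec = rng.normal(scale=0.1, size=(k, d))
b_dec = np.zeros(d)

def encode(x):
    """Sparse code: ReLU keeps only positively activated neurons."""
    return np.maximum(x @ W_enc + b_enc, 0.0)

def decode(z):
    """Map a sparse code back to embedding space."""
    return z @ W_dec + b_dec

def frozen_score(u, v):
    """Assumed stand-in for the frozen recommender: dot-product affinity."""
    return np.sum(u * v, axis=-1)

def prediction_aware_loss(u, v, alpha=1.0, beta=1e-3):
    """Reconstruction + prediction-consistency + L1 sparsity (assumed form)."""
    zu, zv = encode(u), encode(v)
    u_hat, v_hat = decode(zu), decode(zv)
    # Plain reconstruction error on each embedding.
    recon = np.mean((u - u_hat) ** 2) + np.mean((v - v_hat) ** 2)
    # Scores from reconstructed embeddings should match the original scores.
    pred = np.mean((frozen_score(u_hat, v_hat) - frozen_score(u, v)) ** 2)
    # L1 penalty encourages few active neurons per embedding.
    sparsity = np.mean(np.abs(zu)) + np.mean(np.abs(zv))
    return recon + alpha * pred + beta * sparsity

def suppress_neuron(x, j):
    """Post hoc control: zero neuron j and decode the edited code."""
    z = encode(x)
    z[..., j] = 0.0
    return decode(z)

users = rng.normal(size=(4, d))
items = rng.normal(size=(4, d))
loss = prediction_aware_loss(users, items)
edited_items = suppress_neuron(items, j=0)
```

The sketch only evaluates the loss. In an actual training loop, an automatic-differentiation framework would propagate the gradient of the prediction term through the frozen scorer and update only the SAE weights. `suppress_neuron` illustrates the kind of post hoc edit the abstract mentions: the edited embedding is fed back to the unchanged recommender, so the base model is never modified.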