

Refining Contrastive Learning and Homography Relations for Multi-Modal Recommendation

August 19, 2025
Authors: Shouxing Ma, Yawen Zeng, Shiqing Wu, Guandong Xu
cs.AI

Abstract

Multi-modal recommender systems use the rich modal information of items (i.e., images and textual descriptions) to improve recommendation performance. Current methods have achieved remarkable success thanks to the powerful structure modeling capability of graph neural networks. However, these methods are often hindered by sparse data in real-world scenarios. Although contrastive learning and homography (i.e., homogeneous graphs) are employed to address the data sparsity challenge, existing methods still suffer from two main limitations: 1) simple multi-modal feature contrasts fail to produce effective representations, causing noisy modal-shared features and loss of valuable information in modal-unique features; 2) the lack of exploration of the homography relations between user interests and item co-occurrence results in incomplete mining of user-item interplay. To address these limitations, we propose a novel framework for REfining multi-modAl contRastive learning and hoMography relations (REARM). Specifically, we complement multi-modal contrastive learning with meta-network and orthogonal constraint strategies, which filter out noise in modal-shared features and retain recommendation-relevant information in modal-unique features. To mine homogeneous relationships effectively, we integrate a newly constructed user interest graph and an item co-occurrence graph with the existing user co-occurrence and item semantic graphs for graph learning. Extensive experiments on three real-world datasets demonstrate the superiority of REARM over various state-of-the-art baselines. Our visualization further shows the improvement REARM makes in distinguishing between modal-shared and modal-unique features. Code is available at https://github.com/MrShouxingMa/REARM.
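The abstract names two ingredients without giving formulas: an item co-occurrence graph and an orthogonal constraint between modal-shared and modal-unique features. The sketch below is only an illustration of what such components could look like, not the REARM implementation. The function names, the top-k neighbour cutoff, the tie-breaking rule, and the squared-dot-product penalty are all assumptions made for this example.

```python
from collections import defaultdict
from itertools import combinations


def item_cooccurrence_graph(interactions, top_k=2):
    """Build a top-k item co-occurrence graph from (user, item) pairs.

    Two items co-occur when the same user interacted with both.
    The top_k cutoff is an illustrative assumption, not a value from the paper.
    """
    items_by_user = defaultdict(set)
    for user, item in interactions:
        items_by_user[user].add(item)

    # Count, for every unordered item pair, how many users interacted with both.
    counts = defaultdict(lambda: defaultdict(int))
    for items in items_by_user.values():
        for a, b in combinations(sorted(items), 2):
            counts[a][b] += 1
            counts[b][a] += 1

    # Keep only the top_k strongest neighbours of each item
    # (ties broken by item id so the result is deterministic).
    graph = {}
    for item, neighbours in counts.items():
        ranked = sorted(neighbours.items(), key=lambda kv: (-kv[1], kv[0]))
        graph[item] = ranked[:top_k]
    return graph


def orthogonality_penalty(shared, unique):
    """Squared dot product of two feature vectors; zero when they are orthogonal."""
    dot = sum(s * u for s, u in zip(shared, unique))
    return dot * dot


# Toy interaction log (hypothetical data).
interactions = [
    ("u1", "i1"), ("u1", "i2"), ("u1", "i3"),
    ("u2", "i1"), ("u2", "i2"),
    ("u3", "i2"), ("u3", "i3"),
]
graph = item_cooccurrence_graph(interactions, top_k=1)
```

In this toy log, `i1` and `i2` are shared by two users, so each item keeps its most frequent co-occurring neighbour. Items that never co-occur with another item do not appear in the returned dictionary. A penalty of this kind would typically be added to the training loss to discourage overlap between shared and unique representations; the weighting and exact form used by the authors are in their repository, not in this sketch.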