

Understanding Embedding Scaling in Collaborative Filtering

September 19, 2025
Authors: Zhuangzhuang He, Zhou Kaiyu, Haoyue Bai, Fengbin Zhu, Yonghui Yang
cs.AI

Abstract

Scaling recommendation models into large recommendation models has become one of the most widely discussed topics. Recent efforts have focused on components other than the embedding dimension, as it is believed that scaling embeddings may lead to performance degradation. Although there have been some initial observations on embeddings, the root cause of their non-scalability remains unclear. Moreover, whether this performance degradation occurs across different types of models and datasets is still unexplored. To study the effect of embedding dimension on performance, we conduct large-scale experiments across 10 datasets with varying sparsity levels and scales, using 4 representative classical architectures. We surprisingly observe two novel phenomena: double-peak and logarithmic. In the former, as the embedding dimension increases, performance first improves, then declines, rises again, and eventually drops; in the latter, performance follows a perfect logarithmic curve. Our contributions are threefold. First, we discover two novel phenomena when scaling collaborative filtering models. Second, we develop an understanding of the underlying causes of the double-peak phenomenon. Lastly, we theoretically analyze the noise robustness of collaborative filtering models, with results matching empirical observations.
September 23, 2025