

Wukong: Towards a Scaling Law for Large-Scale Recommendation

March 4, 2024
Authors: Buyun Zhang, Liang Luo, Yuxin Chen, Jade Nie, Xi Liu, Daifeng Guo, Yanli Zhao, Shen Li, Yuchen Hao, Yantao Yao, Guna Lakshminarayanan, Ellie Dingqiao Wen, Jongsoo Park, Maxim Naumov, Wenlin Chen
cs.AI

Abstract

Scaling laws play an instrumental role in the sustainable improvement of model quality. Unfortunately, recommendation models to date do not exhibit scaling laws similar to those observed in the domain of large language models, due to the inefficiency of their upscaling mechanisms. This limitation poses significant challenges in adapting these models to increasingly complex real-world datasets. In this paper, we propose an effective network architecture based purely on stacked factorization machines, together with a synergistic upscaling strategy, collectively dubbed Wukong, to establish a scaling law in the domain of recommendation. Wukong's unique design makes it possible to capture diverse interactions of any order simply through taller and wider layers. We conducted extensive evaluations on six public datasets, and our results demonstrate that Wukong consistently outperforms state-of-the-art models in quality. Further, we assessed Wukong's scalability on an internal, large-scale dataset. The results show that Wukong retains its quality advantage over state-of-the-art models while upholding the scaling law across two orders of magnitude in model complexity, extending beyond 100 Gflop, or equivalently up to the GPT-3/LLaMa-2 scale of total training compute, where prior art falls short.
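The abstract itself contains no code, but the core idea it describes, stacking factorization-machine layers and scaling them taller and wider, can be illustrated with a minimal PyTorch sketch. The class name StackedFMLayer, the pairwise dot-product interaction, and the linear projection back to the embedding grid below are illustrative assumptions, not the paper's actual Wukong blocks (which the abstract does not specify).

```python
import torch
import torch.nn as nn


class StackedFMLayer(nn.Module):
    """Hypothetical sketch of one stackable FM-style layer (not the
    paper's exact design). It computes pairwise interactions among
    n input embeddings, then projects them back to n embeddings of
    size d so layers can be stacked. Because each layer interacts
    the previous layer's outputs, stacking roughly doubles the
    attainable interaction order per layer ("taller" scaling), while
    n and d serve as the "wider" knobs."""

    def __init__(self, num_embeddings: int, dim: int):
        super().__init__()
        # Mix the flattened (n x n) pairwise-interaction matrix back
        # into n embedding vectors of size dim.
        self.proj = nn.Linear(num_embeddings * num_embeddings,
                              num_embeddings * dim)
        self.norm = nn.LayerNorm(dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, num_embeddings, dim)
        b, n, d = x.shape
        inter = torch.bmm(x, x.transpose(1, 2))   # (b, n, n) pairwise dots
        out = self.proj(inter.reshape(b, n * n))  # mix interactions
        out = out.reshape(b, n, d)
        return self.norm(out + x)                 # residual keeps lower orders


# Depth (number of layers) and width (num_embeddings, dim) are the
# scaling knobs suggested by the abstract's "taller and wider layers".
model = nn.Sequential(*[StackedFMLayer(num_embeddings=16, dim=64)
                        for _ in range(4)])
emb = torch.randn(8, 16, 64)   # e.g. 8 examples, 16 feature embeddings
print(model(emb).shape)        # torch.Size([8, 16, 64])
```

Under these assumptions, upscaling the model is a matter of increasing the layer count and the embedding grid size rather than adding architecturally different components, which is the property the abstract credits for Wukong's smooth scaling behavior.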