LiRank: Industrial Large Scale Ranking Models at LinkedIn
February 10, 2024
Authors: Fedor Borisyuk, Mingzhou Zhou, Qingquan Song, Siyu Zhu, Birjodh Tiwana, Ganesh Parameswaran, Siddharth Dangi, Lars Hertel, Qiang Xiao, Xiaochen Hou, Yunbo Ouyang, Aman Gupta, Sheallika Singh, Dan Liu, Hailing Cheng, Lei Le, Jonathan Hung, Sathiya Keerthi, Ruoyan Wang, Fengyu Zhang, Mohit Kothari, Chen Zhu, Daqi Sun, Yun Dai, Xun Luan, Sirou Zhu, Zhiwei Wang, Neil Daftary, Qianqi Shen, Chengming Jiang, Haichao Wei, Maneesh Varshney, Amol Ghoting, Souvik Ghosh
cs.AI
Abstract
We present LiRank, a large-scale ranking framework at LinkedIn that brings to
production state-of-the-art modeling architectures and optimization methods. We
unveil several modeling improvements, including Residual DCN, which adds
attention and residual connections to the famous DCNv2 architecture. We share
insights into combining and tuning SOTA architectures to create a unified
model, including Dense Gating, Transformers and Residual DCN. We also propose
novel techniques for calibration and describe how we productionalized deep
learning based explore/exploit methods. To enable effective, production-grade
serving of large ranking models, we detail how to train and compress models
using quantization and vocabulary compression. We provide details about the
deployment setup for large-scale use cases of Feed ranking, Jobs
Recommendations, and Ads click-through rate (CTR) prediction. We summarize our
learnings from various A/B tests by elucidating the most effective technical
approaches. These ideas have contributed to relative metrics improvements
across the board at LinkedIn: +0.5% member sessions in the Feed, +1.76%
qualified job applications for Jobs search and recommendations, and +4.3% for
Ads CTR. We hope this work can provide practical insights and solutions for
practitioners interested in leveraging large-scale deep ranking systems.
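
For readers unfamiliar with the DCNv2 cross layer that Residual DCN builds on, here is a minimal sketch of the standard cross layer, x_{l+1} = x_0 ⊙ (W x_l + b) + x_l, for a single example. It is not the Residual DCN described in this work: the abstract does not give the details of the added attention and residual connections, so they are left out. The function name `cross_layer` and the toy weights are illustrative assumptions.

```python
def cross_layer(x0, xl, W, b):
    """Standard DCNv2 cross layer for one example (plain lists of floats).

    x0: the original input features
    xl: the output of the previous cross layer (x0 itself for the first layer)
    W:  square weight matrix as a list of rows
    b:  bias vector
    """
    # Linear projection of the current layer input: W @ xl + b
    projected = [sum(w * v for w, v in zip(row, xl)) + bias
                 for row, bias in zip(W, b)]
    # Elementwise product with the original input, plus a skip connection to xl
    return [a * p + r for a, p, r in zip(x0, projected, xl)]


# Toy values, chosen only so the arithmetic is easy to follow by hand.
x0 = [1.0, 2.0]
W = [[0.5, 0.0], [0.0, 0.5]]
b = [0.0, 1.0]

x1 = cross_layer(x0, x0, W, b)  # first cross layer
x2 = cross_layer(x0, x1, W, b)  # second layer (weights reused here for brevity)
print(x1, x2)
```

Stacking such layers lets the model form explicit feature crosses of increasing order. In practice each layer would have its own learned `W` and `b`, and the computation would be batched in a tensor library.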