RecGPT Technical Report
July 30, 2025
Authors: Chao Yi, Dian Chen, Gaoyang Guo, Jiakai Tang, Jian Wu, Jing Yu, Sunhao Dai, Wen Chen, Wenjun Yang, Yuning Jiang, Zhujin Gao, Bo Zheng, Chi Li, Dimin Wang, Dixuan Wang, Fan Li, Fan Zhang, Haibin Chen, Haozhuang Liu, Jialin Zhu, Jiamang Wang, Jiawei Wu, Jin Cui, Ju Huang, Kai Zhang, Kan Liu, Lang Tian, Liang Rao, Longbin Li, Lulu Zhao, Mao Zhang, Na He, Peiyang Wang, Qiqi Huang, Tao Luo, Wenbo Su, Xiaoxiao He, Xin Tong, Xu Chen, Xunke Xi, Yang Li, Yaxuan Wu, Yeqiu Yang, Yi Hu, Yinnan Song, Yuchen Li, Yujie Luo, Yujin Yuan, Yuliang Yan, Zhengyang Wang, Zhibo Xiao, Zhixin Ma, Zile Zhou
cs.AI
Abstract
Recommender systems are among the most impactful applications of artificial
intelligence, serving as critical infrastructure connecting users, merchants,
and platforms. However, most current industrial systems remain heavily reliant
on historical co-occurrence patterns and log-fitting objectives, i.e.,
optimizing for past user interactions without explicitly modeling user intent.
This log-fitting approach often leads to overfitting to narrow historical
preferences, failing to capture users' evolving and latent interests. As a
result, it reinforces filter bubbles and long-tail phenomena, ultimately
harming user experience and threatening the sustainability of the whole
recommendation ecosystem.
To address these challenges, we rethink the overall design paradigm of
recommender systems and propose RecGPT, a next-generation framework that places
user intent at the center of the recommendation pipeline. By integrating large
language models (LLMs) into key stages of user interest mining, item retrieval,
and explanation generation, RecGPT transforms log-fitting recommendation into
an intent-centric process. To effectively align general-purpose LLMs with these
domain-specific recommendation tasks at scale, RecGPT incorporates a
multi-stage training paradigm, which integrates reasoning-enhanced
pre-alignment and self-training evolution, guided by a Human-LLM cooperative
judge system. Currently, RecGPT has been fully deployed on the Taobao App.
Online experiments demonstrate that RecGPT achieves consistent performance
gains across stakeholders: users benefit from increased content diversity and
satisfaction, while merchants and the platform gain greater exposure and
conversions.
These comprehensive improvements across all stakeholders validate that
LLM-driven, intent-centric design can foster a more sustainable and mutually
beneficial recommendation ecosystem.