RecGPT Technical Report
July 30, 2025
Authors: Chao Yi, Dian Chen, Gaoyang Guo, Jiakai Tang, Jian Wu, Jing Yu, Sunhao Dai, Wen Chen, Wenjun Yang, Yuning Jiang, Zhujin Gao, Bo Zheng, Chi Li, Dimin Wang, Dixuan Wang, Fan Li, Fan Zhang, Haibin Chen, Haozhuang Liu, Jialin Zhu, Jiamang Wang, Jiawei Wu, Jin Cui, Ju Huang, Kai Zhang, Kan Liu, Lang Tian, Liang Rao, Longbin Li, Lulu Zhao, Mao Zhang, Na He, Peiyang Wang, Qiqi Huang, Tao Luo, Wenbo Su, Xiaoxiao He, Xin Tong, Xu Chen, Xunke Xi, Yang Li, Yaxuan Wu, Yeqiu Yang, Yi Hu, Yinnan Song, Yuchen Li, Yujie Luo, Yujin Yuan, Yuliang Yan, Zhengyang Wang, Zhibo Xiao, Zhixin Ma, Zile Zhou
cs.AI
Abstract
Recommender systems are among the most impactful applications of artificial
intelligence, serving as critical infrastructure connecting users, merchants,
and platforms. However, most current industrial systems remain heavily reliant
on historical co-occurrence patterns and log-fitting objectives, i.e.,
optimizing for past user interactions without explicitly modeling user intent.
This log-fitting approach often leads to overfitting to narrow historical
preferences, failing to capture users' evolving and latent interests. As a
result, it reinforces filter bubbles and long-tail phenomena, ultimately
harming user experience and threatening the sustainability of the whole
recommendation ecosystem.
To address these challenges, we rethink the overall design paradigm of
recommender systems and propose RecGPT, a next-generation framework that places
user intent at the center of the recommendation pipeline. By integrating large
language models (LLMs) into key stages of user interest mining, item retrieval,
and explanation generation, RecGPT transforms log-fitting recommendation into
an intent-centric process. To effectively align general-purpose LLMs to the
above domain-specific recommendation tasks at scale, RecGPT incorporates a
multi-stage training paradigm, which integrates reasoning-enhanced
pre-alignment and self-training evolution, guided by a Human-LLM cooperative
judge system. Currently, RecGPT has been fully deployed on the Taobao App.
Online experiments demonstrate that RecGPT achieves consistent performance
gains across stakeholders: users benefit from increased content diversity and
satisfaction, while merchants and the platform gain greater exposure and conversions.
These comprehensive improvements across all stakeholders validate that
LLM-driven, intent-centric design can foster a more sustainable and mutually
beneficial recommendation ecosystem.