RecGPT-V2 Technical Report

December 16, 2025
Authors: Chao Yi, Dian Chen, Gaoyang Guo, Jiakai Tang, Jian Wu, Jing Yu, Mao Zhang, Wen Chen, Wenjun Yang, Yujie Luo, Yuning Jiang, Zhujin Gao, Bo Zheng, Binbin Cao, Changfa Wu, Dixuan Wang, Han Wu, Haoyi Hu, Kewei Zhu, Lang Tian, Lin Yang, Qiqi Huang, Siqi Yang, Wenbo Su, Xiaoxiao He, Xin Tong, Xu Chen, Xunke Xi, Xiaowei Huang, Yaxuan Wu, Yeqiu Yang, Yi Hu, Yujin Yuan, Yuliang Yan, Zile Zhou
cs.AI

Abstract

Large language models (LLMs) have demonstrated remarkable potential in transforming recommender systems from implicit behavioral pattern matching to explicit intent reasoning. While RecGPT-V1 successfully pioneered this paradigm by integrating LLM-based reasoning into user interest mining and item tag prediction, it suffers from four fundamental limitations: (1) computational inefficiency and cognitive redundancy across multiple reasoning routes; (2) insufficient explanation diversity in fixed-template generation; (3) limited generalization under supervised learning paradigms; and (4) simplistic outcome-focused evaluation that fails to match human standards. To address these challenges, we present RecGPT-V2 with four key innovations. First, a Hierarchical Multi-Agent System restructures intent reasoning through coordinated collaboration, eliminating cognitive duplication while enabling diverse intent coverage. Combined with Hybrid Representation Inference that compresses user-behavior contexts, our framework reduces GPU consumption by 60% and improves exclusive recall from 9.39% to 10.99%. Second, a Meta-Prompting framework dynamically generates contextually adaptive prompts, improving explanation diversity by +7.3%. Third, constrained reinforcement learning mitigates multi-reward conflicts, achieving +24.1% improvement in tag prediction and +13.0% in explanation acceptance. Fourth, an Agent-as-a-Judge framework decomposes assessment into multi-step reasoning, improving human preference alignment. Online A/B tests on Taobao demonstrate significant improvements: +2.98% CTR, +3.71% IPV, +2.19% TV, and +11.46% NER. RecGPT-V2 establishes both the technical feasibility and commercial viability of deploying LLM-powered intent reasoning at scale, bridging the gap between cognitive exploration and industrial utility.