Interactive Recommendation Agent with Active User Commands
September 25, 2025
Authors: Jiakai Tang, Yujie Luo, Xunke Xi, Fei Sun, Xueyang Feng, Sunhao Dai, Chao Yi, Dian Chen, Zhujin Gao, Yang Li, Xu Chen, Wen Chen, Jian Wu, Yuning Jiang, Bo Zheng
cs.AI
Abstract
Traditional recommender systems rely on passive feedback mechanisms that
limit users to simple choices such as like and dislike. However, these
coarse-grained signals fail to capture users' nuanced behavioral motivations and
intentions. In turn, current systems also cannot distinguish which specific
item attributes drive user satisfaction or dissatisfaction, resulting in
inaccurate preference modeling. These fundamental limitations create a
persistent gap between user intentions and system interpretations, ultimately
undermining user satisfaction and harming system effectiveness.
To address these limitations, we introduce the Interactive Recommendation
Feed (IRF), a pioneering paradigm that enables natural language commands within
mainstream recommendation feeds. Unlike traditional systems that confine users
to passive implicit behavioral influence, IRF gives users active, explicit control
over recommendation policies through real-time linguistic commands. To support
this paradigm, we develop RecBot, a dual-agent architecture where a Parser
Agent transforms linguistic expressions into structured preferences and a
Planner Agent dynamically orchestrates adaptive tool chains for on-the-fly
policy adjustment. To enable practical deployment, we employ
simulation-augmented knowledge distillation to achieve efficient performance
while maintaining strong reasoning capabilities. Through extensive offline and
long-term online experiments, RecBot shows significant improvements in both
user satisfaction and business outcomes.
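
The parse-then-plan flow described above can be illustrated with a minimal sketch. This is not RecBot's implementation: a keyword-rule parser stands in for the LLM-based Parser Agent, and every name here (`Preference`, `parse_command`, `filter_tool`, `boost_tool`, `plan`, `recommend`) and the item schema are hypothetical, chosen only to show how a command might become a structured preference that selects a tool chain.

```python
from dataclasses import dataclass, field


@dataclass
class Preference:
    """Structured preference: attributes the user wants and wants to avoid."""
    positive: set = field(default_factory=set)
    negative: set = field(default_factory=set)


def parse_command(command: str, known_attributes: set) -> Preference:
    """Toy stand-in for a parser agent: keyword rules instead of an LLM."""
    pref = Preference()
    negated = False
    for token in command.lower().replace(",", " ").split():
        if token in {"no", "not", "without", "less"}:
            negated = True
        elif token in {"more", "want", "prefer", "but"}:
            negated = False
        elif token in known_attributes:
            (pref.negative if negated else pref.positive).add(token)
    return pref


def filter_tool(items, pref):
    """Drop items carrying any attribute the user rejected."""
    return [it for it in items if not (it["attrs"] & pref.negative)]


def boost_tool(items, pref):
    """Re-rank by base score plus one point per matched positive attribute."""
    return sorted(
        items,
        key=lambda it: it["score"] + len(it["attrs"] & pref.positive),
        reverse=True,
    )


def plan(pref: Preference):
    """Toy stand-in for a planner agent: pick only the tools the preference needs."""
    chain = []
    if pref.negative:
        chain.append(filter_tool)
    if pref.positive:
        chain.append(boost_tool)
    return chain


def recommend(command, items, known_attributes):
    """Parse the command, build a tool chain, and apply it to the candidates."""
    pref = parse_command(command, known_attributes)
    for tool in plan(pref):
        items = tool(items, pref)
    return [it["id"] for it in items]


# Hypothetical candidate items and attribute vocabulary.
ATTRS = {"red", "blue", "cotton", "wool"}
ITEMS = [
    {"id": "a", "attrs": {"red", "cotton"}, "score": 0.9},
    {"id": "b", "attrs": {"blue", "cotton"}, "score": 0.5},
    {"id": "c", "attrs": {"blue", "wool"}, "score": 0.7},
]

print(recommend("no red, more cotton", ITEMS, ATTRS))  # ['b', 'c']
```

In this sketch the command removes item `a` (rejected attribute) and lifts item `b` above `c` (requested attribute), which mirrors at toy scale the attribute-level control that coarse like/dislike signals cannot express.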