Interactive Recommendation Agent with Active User Commands

September 25, 2025
Authors: Jiakai Tang, Yujie Luo, Xunke Xi, Fei Sun, Xueyang Feng, Sunhao Dai, Chao Yi, Dian Chen, Zhujin Gao, Yang Li, Xu Chen, Wen Chen, Jian Wu, Yuning Jiang, Bo Zheng
cs.AI

Abstract

Traditional recommender systems rely on passive feedback mechanisms that limit users to simple choices such as like or dislike. These coarse-grained signals fail to capture users' nuanced behavioral motivations and intentions. As a result, current systems also cannot distinguish which specific item attributes drive user satisfaction or dissatisfaction, which leads to inaccurate preference modeling. These fundamental limitations create a persistent gap between user intentions and system interpretations, ultimately undermining user satisfaction and harming system effectiveness. To address these limitations, we introduce the Interactive Recommendation Feed (IRF), a pioneering paradigm that enables natural language commands within mainstream recommendation feeds. Unlike traditional systems, which confine users to passive, implicit behavioral influence, IRF gives users active, explicit control over recommendation policies through real-time linguistic commands. To support this paradigm, we develop RecBot, a dual-agent architecture in which a Parser Agent transforms linguistic expressions into structured preferences and a Planner Agent dynamically orchestrates adaptive tool chains for on-the-fly policy adjustment. To enable practical deployment, we employ simulation-augmented knowledge distillation to achieve efficient performance while maintaining strong reasoning capabilities. Through extensive offline and long-term online experiments, RecBot shows significant improvements in both user satisfaction and business outcomes.
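The parse-then-plan flow described in the abstract can be illustrated with a toy sketch. This is not RecBot's implementation: the abstract does not specify prompts, tool sets, or data structures, so every name below (`Preference`, `Item`, `parse_command`, `plan_tools`, `filter_tool`, `boost_tool`, `update_feed`) is hypothetical, and simple keyword rules stand in for the LLM-based Parser and Planner Agents.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Preference:
    """Structured preference (hypothetical schema): tags to favor or avoid."""
    include: List[str] = field(default_factory=list)
    exclude: List[str] = field(default_factory=list)


@dataclass
class Item:
    title: str
    tags: List[str]
    score: float  # base relevance from an upstream recommender (assumed)


def parse_command(command: str, vocabulary: List[str]) -> Preference:
    """Toy stand-in for a Parser Agent: keyword rules instead of an LLM."""
    pref = Preference()
    negated = False
    for token in command.lower().replace(",", " ").split():
        if token in {"no", "not", "without", "less"}:
            negated = True
        elif token in {"more", "show", "want", "but"}:
            negated = False
        elif token in vocabulary:
            (pref.exclude if negated else pref.include).append(token)
    return pref


Tool = Callable[[List[Item], Preference], List[Item]]


def filter_tool(items: List[Item], pref: Preference) -> List[Item]:
    """Drop items that carry any excluded tag."""
    return [it for it in items if not set(it.tags) & set(pref.exclude)]


def boost_tool(items: List[Item], pref: Preference) -> List[Item]:
    """Add 1.0 to the score per matching included tag, then re-rank."""
    boosted = [
        Item(it.title, it.tags, it.score + len(set(it.tags) & set(pref.include)))
        for it in items
    ]
    return sorted(boosted, key=lambda it: it.score, reverse=True)


def plan_tools(pref: Preference) -> List[Tool]:
    """Toy stand-in for a Planner Agent: pick only the tools the preference needs."""
    chain: List[Tool] = []
    if pref.exclude:
        chain.append(filter_tool)
    if pref.include:
        chain.append(boost_tool)
    return chain


def update_feed(command: str, items: List[Item], vocabulary: List[str]) -> List[Item]:
    """Parse the command, plan a tool chain, and apply it to the feed."""
    pref = parse_command(command, vocabulary)
    for tool in plan_tools(pref):
        items = tool(items, pref)
    return items


feed = [
    Item("Beach sandals", ["sandals"], 0.9),
    Item("Trail boots", ["hiking"], 0.5),
    Item("City sneakers", ["sneakers"], 0.7),
]
new_feed = update_feed("show more hiking, no sandals", feed,
                       ["hiking", "sandals", "sneakers"])
print([it.title for it in new_feed])  # ['Trail boots', 'City sneakers']
```

The point of the sketch is the separation of concerns: the command is first reduced to a structured preference, and only then is a chain of tools selected and applied, so the feed changes immediately without retraining the underlying recommender.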