RecGPT Technical Report

Abstract

Recommender systems are among the most impactful applications of artificial intelligence, serving as critical infrastructure connecting users, merchants, and platforms. However, most current industrial systems remain heavily reliant on historical co-occurrence patterns and log-fitting objectives, i.e., they optimize for past user interactions without explicitly modeling user intent. This log-fitting approach tends to overfit to narrow historical preferences and fails to capture users' evolving and latent interests. As a result, it reinforces filter bubbles and long-tail phenomena, ultimately harming user experience and threatening the sustainability of the whole recommendation ecosystem. To address these challenges, we rethink the overall design paradigm of recommender systems and propose RecGPT, a next-generation framework that places user intent at the center of the recommendation pipeline. By integrating large language models (LLMs) into the key stages of user interest mining, item retrieval, and explanation generation, RecGPT transforms log-fitting recommendation into an intent-centric process. To align general-purpose LLMs with these domain-specific recommendation tasks effectively and at scale, RecGPT adopts a multi-stage training paradigm that integrates reasoning-enhanced pre-alignment and self-training evolution, guided by a Human-LLM cooperative judge system. RecGPT is currently fully deployed on the Taobao App. Online experiments demonstrate that RecGPT achieves consistent performance gains across stakeholders: users benefit from increased content diversity and satisfaction, while merchants and the platform gain greater exposure and conversions. These comprehensive improvements across all stakeholders validate that LLM-driven, intent-centric design can foster a more sustainable and mutually beneficial recommendation ecosystem.

