RecGPT Technical Report

Abstract

Recommender systems are among the most impactful applications of artificial intelligence, serving as critical infrastructure connecting users, merchants, and platforms. However, most current industrial systems remain heavily reliant on historical co-occurrence patterns and log-fitting objectives, i.e., optimizing for past user interactions without explicitly modeling user intent. This log-fitting approach tends to overfit to narrow historical preferences and fails to capture users' evolving and latent interests. As a result, it reinforces filter bubbles and long-tail phenomena, ultimately harming user experience and threatening the sustainability of the whole recommendation ecosystem. To address these challenges, we rethink the overall design paradigm of recommender systems and propose RecGPT, a next-generation framework that places user intent at the center of the recommendation pipeline. By integrating large language models (LLMs) into the key stages of user interest mining, item retrieval, and explanation generation, RecGPT transforms log-fitting recommendation into an intent-centric process. To align general-purpose LLMs with these domain-specific recommendation tasks effectively and at scale, RecGPT incorporates a multi-stage training paradigm that integrates reasoning-enhanced pre-alignment and self-training evolution, guided by a Human-LLM cooperative judge system. RecGPT is currently fully deployed on the Taobao App. Online experiments demonstrate that RecGPT achieves consistent performance gains across stakeholders: users benefit from increased content diversity and satisfaction, while merchants and the platform gain greater exposure and conversions. These comprehensive improvements across all stakeholders validate that LLM-driven, intent-centric design can foster a more sustainable and mutually beneficial recommendation ecosystem.
