OneRank: A Unified Transformer-Native Ranking Architecture for Multi-Task Recommendation

Abstract

Multi-task learning (MTL) is essential in recommender systems because it enables complementary learning across diverse user feedback signals. Modern industrial practice has shifted from DNNs to Transformer-centric architectures to strengthen sequence modeling and scaling capacity, yet these systems still decouple feature encoding from multi-task prediction and treat the Transformer as a task-agnostic encoder. This design fundamentally limits performance and scalability in three ways: (1) it creates an information bottleneck under heterogeneous task objectives, (2) it induces gradient interference that leads to the seesaw phenomenon, and (3) it forces a dataflow transition in which attention-based, context-adaptive representation learning is handed off to static feed-forward task prediction with incompatible information read-write dynamics. We propose OneRank, a Transformer-native multi-task ranking framework that eliminates the encoder-predictor separation and introduces task-private channels for both forward representation learning and backward optimization, enabling task-specialized learning while reducing inter-task interference. In the forward pass, OneRank learns task-specific representations bottom-up through task-conditioned information selection, candidate-aware contextualization, and controlled cross-task interaction. In the backward pass, cross-task gradient detachment isolates task-private parameter updates from the shared knowledge extraction modules, preventing negative transfer. We further replace static task-specific MLP scorers with dynamic matching-based scoring for context-aware personalized ranking. By internalizing multi-task reasoning within the Transformer stack, OneRank establishes a unified and scalable architectural paradigm. Offline and online experiments on large-scale industrial datasets show that OneRank significantly outperforms state-of-the-art baselines while maintaining computational efficiency.
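To make the contrast between static task scorers and dynamic matching-based scoring concrete, the following is a minimal sketch and not the OneRank implementation. The abstract does not give the scoring function, so the scaled dot product followed by a sigmoid is an assumption, as are all names (`static_score`, `matching_score`, `task_repr`) and the toy vectors. A single linear layer stands in for the static MLP scorer.

```python
import math


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def static_score(candidate_repr, task_weights, bias=0.0):
    """Static scorer: the task weights are fixed parameters,
    identical for every user and request (hypothetical stand-in
    for a per-task MLP head)."""
    return sigmoid(dot(task_weights, candidate_repr) + bias)


def matching_score(task_repr, candidate_repr):
    """Matching-based scorer (assumed form): the task representation is
    computed per request, so it plays the role of context-dependent
    weights matched against the candidate representation."""
    scale = math.sqrt(len(candidate_repr))
    return sigmoid(dot(task_repr, candidate_repr) / scale)


# Toy vectors, invented for illustration only.
candidate = [1.0, -1.0, 0.5, 0.0]
click_weights = [0.5, 0.5, 0.0, 0.0]      # fixed "click" head weights
task_repr_user_a = [2.0, -2.0, 0.0, 0.0]  # click-task representation, user A
task_repr_user_b = [-2.0, 2.0, 0.0, 0.0]  # click-task representation, user B

score_static = static_score(candidate, click_weights)
score_a = matching_score(task_repr_user_a, candidate)
score_b = matching_score(task_repr_user_b, candidate)
```

In this sketch the static head can only vary its output through the candidate representation, whereas the matching scorer ranks the same candidate differently for users A and B because the task-side vector itself changes with context.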