OneRank: A Unified Transformer-Native Ranking Architecture for Multi-Task Recommendation

Abstract

Multi-task learning (MTL) is essential in recommender systems because it enables complementary learning across diverse user feedback signals. Although modern industrial practice has shifted from DNNs to Transformer-centric architectures to strengthen sequence modeling and scaling capacity, these systems still decouple feature encoding from multi-task prediction and treat the Transformer as a task-agnostic encoder. This design fundamentally limits performance and scalability in three ways: (1) it creates an information bottleneck under heterogeneous task objectives; (2) it induces gradient interference, which leads to the seesaw phenomenon; and (3) it forces a dataflow transition in which attention-based, context-adaptive representation learning is converted into static feed-forward task prediction with incompatible information read-write dynamics. We propose OneRank, a Transformer-native multi-task ranking framework that eliminates the encoder-predictor separation and introduces task-private channels for both forward representation learning and backward optimization, enabling task-specialized learning while reducing inter-task interference. In the forward pass, OneRank learns task-specific representations bottom-up through task-conditioned information selection, candidate-aware contextualization, and controlled cross-task interaction. In the backward pass, cross-task gradient detachment isolates task-private parameter updates from the shared knowledge extraction modules, preventing negative transfer. We further replace static task-specific MLP scorers with dynamic matching-based scoring to support context-aware personalized ranking. By internalizing multi-task reasoning within the Transformer stack, OneRank establishes a unified and scalable architectural paradigm. Offline and online experiments on large-scale industrial datasets show that OneRank significantly outperforms state-of-the-art baselines while maintaining computational efficiency.
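To make the contrast between a static task-specific scorer and dynamic matching-based scoring concrete, the following is a minimal sketch and not the OneRank implementation. It assumes that "matching" means a scaled dot product between a per-request task vector and a candidate vector, followed by a sigmoid. The abstract does not specify the scoring function, so this form, the function names (`static_score`, `matching_score`), the task names, and all numbers are hypothetical.

```python
import math


def sigmoid(x: float) -> float:
    """Map a logit to a probability-like score in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def static_score(candidate_vec, head_weights, head_bias=0.0):
    """Static head (stand-in for a one-layer MLP scorer).

    The same fixed weights score every request for a given task.
    """
    logit = sum(w * c for w, c in zip(head_weights, candidate_vec)) + head_bias
    return sigmoid(logit)


def matching_score(task_vec, candidate_vec):
    """Dynamic head (assumed form): scaled dot-product matching.

    The "weights" are a task vector computed per request, so the scoring
    function itself changes with the user context.
    """
    if len(task_vec) != len(candidate_vec):
        raise ValueError("task and candidate vectors must share a dimension")
    scale = math.sqrt(len(task_vec))
    logit = sum(t * c for t, c in zip(task_vec, candidate_vec)) / scale
    return sigmoid(logit)


# Hypothetical values: one candidate item and two per-request task vectors.
candidate = [0.5, -1.0, 2.0, 0.0]
task_vectors = {
    "click": [1.0, 0.0, 1.0, 0.0],
    "like": [0.0, 1.0, 0.0, 1.0],
}
scores = {
    task: matching_score(vec, candidate) for task, vec in task_vectors.items()
}
```

In this sketch the same candidate receives a different score per task because each task vector emphasizes different dimensions, and a different request would yield different task vectors and therefore different scores. A static head, by contrast, applies identical weights regardless of context.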