OnePiece: Bringing Context Engineering and Reasoning to Industrial Cascade Ranking System

Abstract

Despite growing interest in replicating the scaled success of large language models (LLMs) in industrial search and recommender systems, most existing industrial efforts remain limited to transplanting Transformer architectures, which bring only incremental improvements over strong Deep Learning Recommendation Models (DLRMs). From a first-principles perspective, the breakthroughs of LLMs stem not only from their architectures but also from two complementary mechanisms: context engineering, which enriches raw input queries with contextual cues to better elicit model capabilities, and multi-step reasoning, which iteratively refines model outputs through intermediate reasoning paths. However, these two mechanisms, and their potential to unlock substantial improvements, remain largely unexplored in industrial ranking systems. In this paper, we propose OnePiece, a unified framework that seamlessly integrates LLM-style context engineering and reasoning into both the retrieval and ranking models of industrial cascaded pipelines. OnePiece is built on a pure Transformer backbone and introduces three key innovations: (1) structured context engineering, which augments interaction history with preference and scenario signals and unifies them into a structured tokenized input sequence for both retrieval and ranking; (2) block-wise latent reasoning, which equips the model with multi-step refinement of representations and scales reasoning bandwidth via block size; (3) progressive multi-task training, which leverages user feedback chains to effectively supervise reasoning steps during training. OnePiece has been deployed in the main personalized search scenario of Shopee and achieves consistent online gains across key business metrics, including over +2% GMV/UU and a +2.90% increase in advertising revenue.
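As a rough illustration of innovation (1), the sketch below shows one way interaction history, preference signals, and scenario signals could be flattened into a single ordered token sequence shared by retrieval and ranking. It is not the paper's implementation: the `Token` type, the `build_input_sequence` function, the bracketed separator tokens, and the optional `candidates` group used for the ranking case are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class Token:
    segment: str  # which signal group the token belongs to (hypothetical labels)
    value: str    # hypothetical item or feature identifier


def build_input_sequence(
    history: Sequence[str],
    preference: Sequence[str],
    scenario: Sequence[str],
    candidates: Sequence[str] = (),
) -> List[Token]:
    """Concatenate signal groups into one ordered token sequence.

    Assumption: each non-empty group is introduced by a separator token,
    and candidate items are appended only in the ranking case.
    """
    groups = [
        ("history", history),
        ("preference", preference),
        ("scenario", scenario),
        ("candidate", candidates),
    ]
    tokens: List[Token] = []
    for segment, values in groups:
        if not values:
            continue  # skip empty groups so no dangling separator is emitted
        tokens.append(Token("sep", f"[{segment.upper()}]"))
        tokens.extend(Token(segment, v) for v in values)
    return tokens


# Retrieval-style input: no candidate items.
retrieval_seq = build_input_sequence(
    history=["item_1", "item_2"],
    preference=["item_9"],
    scenario=["query:shoes"],
)

# Ranking-style input: the same builder with candidate items appended.
ranking_seq = build_input_sequence(
    history=["item_1"],
    preference=[],
    scenario=["query:shoes"],
    candidates=["cand_a", "cand_b"],
)
```

Using one builder for both stages mirrors the abstract's claim that the same structured sequence format serves retrieval and ranking; how the real system orders, embeds, or separates these groups is not specified in the abstract.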
