OnePiece: Bringing Context Engineering and Reasoning to Industrial Cascade Ranking System
September 22, 2025
Authors: Sunhao Dai, Jiakai Tang, Jiahua Wu, Kun Wang, Yuxuan Zhu, Bingjun Chen, Bangyang Hong, Yu Zhao, Cong Fu, Kangle Wu, Yabo Ni, Anxiang Zeng, Wenjie Wang, Xu Chen, Jun Xu, See-Kiong Ng
cs.AI
Abstract
Despite the growing interest in replicating the scaled success of large
language models (LLMs) in industrial search and recommender systems, most
existing industrial efforts remain limited to transplanting Transformer
architectures, which bring only incremental improvements over strong Deep
Learning Recommendation Models (DLRMs). From a first-principles perspective, the
breakthroughs of LLMs stem not only from their architectures but also from two
complementary mechanisms: context engineering, which enriches raw input queries
with contextual cues to better elicit model capabilities, and multi-step
reasoning, which iteratively refines model outputs through intermediate
reasoning paths. However, these two mechanisms and their potential to unlock
substantial improvements remain largely underexplored in industrial ranking
systems.
In this paper, we propose OnePiece, a unified framework that seamlessly
integrates LLM-style context engineering and reasoning into both retrieval and
ranking models of industrial cascaded pipelines. OnePiece is built on a pure
Transformer backbone and further introduces three key innovations: (1)
structured context engineering, which augments interaction history with
preference and scenario signals and unifies them into a structured tokenized
input sequence for both retrieval and ranking; (2) block-wise latent reasoning,
which equips the model with multi-step refinement of representations and scales
reasoning bandwidth via block size; (3) progressive multi-task training, which
leverages user feedback chains to effectively supervise reasoning steps during
training. OnePiece has been deployed in the main personalized search scenario
of Shopee and achieves consistent online gains across different key business
metrics, including over +2% GMV/UU and a +2.90% increase in advertising
revenue.
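As a toy illustration of how the two LLM-style mechanisms above could be combined, the following pure-Python sketch unifies interaction history, preference signals, and scenario signals into one tokenized sequence, then appends a fixed-size block of latent slots at each reasoning step and refines the whole sequence. All names (`build_context_sequence`, `refine`, the signal fields) and the scalar "refinement" are illustrative assumptions, not the paper's implementation.

```python
# Hypothetical sketch of OnePiece-style structured context engineering
# and block-wise latent reasoning. Names and the toy refinement pass
# are assumptions for illustration, not the paper's actual method.

def build_context_sequence(history, preferences, scenario):
    """Structured context engineering: merge interaction history,
    preference signals, and scenario signals into one token sequence."""
    tokens = []
    tokens += [f"<hist>{item}" for item in history]
    tokens += [f"<pref>{p}" for p in preferences]
    tokens += [f"<scene>{s}" for s in scenario]
    return tokens

def refine(states):
    """Stand-in for one Transformer pass: a running average that
    mixes information across sequence positions."""
    out, acc = [], 0.0
    for i, h in enumerate(states, start=1):
        acc += h
        out.append(acc / i)
    return out

def block_wise_latent_reasoning(states, block_size, num_steps):
    """Append `block_size` latent slots per reasoning step, then refine;
    block size scales the reasoning bandwidth of each step."""
    for _ in range(num_steps):
        states = states + [0.0] * block_size  # new latent block
        states = refine(states)               # multi-step refinement
    return states

tokens = build_context_sequence(
    history=["item_17", "item_42"],
    preferences=["brand:acme"],
    scenario=["query:shoes"],
)
# Toy embeddings: one scalar per token (here, just the token length).
states = [float(len(t)) for t in tokens]
final = block_wise_latent_reasoning(states, block_size=2, num_steps=3)
print(len(tokens), len(final))  # → 4 10: sequence grows by block_size per step
```

In the paper's framing, the block size is the knob that widens each reasoning step, while progressive multi-task training (not sketched here) supervises the intermediate steps with user-feedback signals.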