QP-OneModel: A Unified Generative LLM for Multi-Task Query Understanding in Xiaohongshu Search
February 10, 2026
Authors: Jianzhao Huang, Xiaorui Huang, Fei Zhao, Yunpeng Liu, Hui Zhang, Fangcheng Shi, Congfeng Li, Zechen Sun, Yi Wu, Yao Hu, Yunhan Bai, Shaosheng Cao
cs.AI
Abstract
Query Processing (QP) bridges user intent and content supply in large-scale Social Network Service (SNS) search engines. Traditional QP systems rely on pipelines of isolated discriminative models (e.g., BERT) and suffer from limited semantic understanding and high maintenance overhead. While Large Language Models (LLMs) offer a potential solution, existing approaches often optimize sub-tasks in isolation, neglecting their intrinsic semantic synergy and requiring each model to be iterated independently. Moreover, standard generative methods typically lack grounding in SNS scenarios: they fail to bridge the gap between open-domain corpora and informal SNS linguistic patterns, and they struggle to adhere to rigorous business definitions. We present QP-OneModel, a unified generative LLM for multi-task query understanding in the SNS domain. We reformulate heterogeneous sub-tasks into a single sequence-generation paradigm and adopt a progressive three-stage alignment strategy culminating in multi-reward reinforcement learning. Furthermore, QP-OneModel generates intent descriptions as a novel high-fidelity semantic signal that effectively augments downstream tasks such as query rewriting and ranking. Offline evaluations show that QP-OneModel achieves a 7.35% overall gain over discriminative baselines, with significant F1 improvements in NER (+9.01%) and Term Weighting (+9.31%). It also generalizes well, surpassing a 32B model by 7.60% accuracy on unseen tasks. QP-OneModel is fully deployed at Xiaohongshu, where online A/B tests confirm its industrial value, improving retrieval relevance (DCG) by 0.21% and lifting user retention by 0.044%.
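The abstract describes the core reformulation only in prose, so a minimal sketch may help: below, heterogeneous QP sub-tasks are serialized into one text-to-text interface served by a single generative model. The task tags, prompt templates, and the `model.generate` call are illustrative assumptions, not the paper's actual format.

```python
# Minimal sketch (assumed format): NER, term weighting, intent
# description, and query rewriting all become plain-text generation
# requests, so one model handles every sub-task.

TASK_TEMPLATES = {
    "ner":            "List the named entities in this search query: {query}",
    "term_weighting": "Weight each term of this search query by importance: {query}",
    "intent":         "Describe the user's search intent behind this query: {query}",
    "rewrite":        "Rewrite this query to better match note content: {query}",
}

def build_prompt(task: str, query: str) -> str:
    """Turn any QP sub-task into a single sequence-generation request."""
    return TASK_TEMPLATES[task].format(query=query)

def run_qp(model, task: str, query: str) -> str:
    """One model, one decoding call, regardless of the sub-task.
    `model.generate` stands in for any causal-LM inference API."""
    return model.generate(build_prompt(task, query))
```

The multi-reward reinforcement learning stage is likewise only named, not specified. Below is a hedged sketch of how per-output reward terms could be combined into the single scalar that a policy-gradient update would maximize; the reward terms, the toy "entity:label" output schema, and the weights are all assumptions.

```python
# Hedged sketch: a weighted multi-reward signal for the RL stage.
# The schema, reward terms, and weights are assumptions; the paper's
# actual reward design is not given in the abstract.

def _spans(s: str) -> set:
    """Parse a toy 'entity:label;entity:label' annotation string."""
    return {p.strip() for p in s.split(";") if p.strip()}

def format_reward(output: str) -> float:
    """Reward schema adherence: every span must look like 'entity:label'."""
    spans = _spans(output)
    return 1.0 if spans and all(":" in p for p in spans) else 0.0

def task_reward(output: str, reference: str) -> float:
    """Toy span-level F1 against a reference annotation."""
    pred, gold = _spans(output), _spans(reference)
    tp = len(pred & gold)
    if tp == 0:
        return 0.0
    precision, recall = tp / len(pred), tp / len(gold)
    return 2 * precision * recall / (precision + recall)

def combined_reward(output: str, reference: str,
                    weights: tuple = (0.3, 0.7)) -> float:
    """Weighted sum of reward terms; the RL stage would maximize the
    expectation of this scalar over sampled model generations."""
    w_fmt, w_task = weights
    return w_fmt * format_reward(output) + w_task * task_reward(output, reference)
```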