QP-OneModel: A Unified Generative LLM for Multi-Task Query Understanding in Xiaohongshu Search

February 10, 2026
Authors: Jianzhao Huang, Xiaorui Huang, Fei Zhao, Yunpeng Liu, Hui Zhang, Fangcheng Shi, Congfeng Li, Zechen Sun, Yi Wu, Yao Hu, Yunhan Bai, Shaosheng Cao
cs.AI

Abstract

Query Processing (QP) bridges user intent and content supply in large-scale Social Network Service (SNS) search engines. Traditional QP systems rely on pipelines of isolated discriminative models (e.g., BERT), suffering from limited semantic understanding and high maintenance overhead. While Large Language Models (LLMs) offer a potential solution, existing approaches often optimize sub-tasks in isolation, neglecting their intrinsic semantic synergy and necessitating independent iteration of each model. Moreover, standard generative methods often lack grounding in SNS scenarios: they fail to bridge the gap between open-domain corpora and informal SNS linguistic patterns, and they struggle to adhere to rigorous business definitions. We present QP-OneModel, a unified generative LLM for multi-task query understanding in the SNS domain. We reformulate heterogeneous sub-tasks into a unified sequence generation paradigm and adopt a progressive three-stage alignment strategy culminating in multi-reward Reinforcement Learning. Furthermore, QP-OneModel generates intent descriptions as a novel high-fidelity semantic signal, effectively augmenting downstream tasks such as query rewriting and ranking. Offline evaluations show that QP-OneModel achieves a 7.35% overall gain over discriminative baselines, with significant F1 boosts in NER (+9.01%) and Term Weighting (+9.31%). It also exhibits superior generalization, surpassing a 32B model by 7.60% in accuracy on unseen tasks. QP-OneModel is fully deployed at Xiaohongshu, where online A/B tests confirm its industrial value, improving retrieval relevance (DCG) by 0.21% and lifting user retention by 0.044%.
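To make the "unified sequence generation paradigm" concrete, the sketch below shows one way heterogeneous QP sub-tasks (here NER, term weighting, and intent description) could be serialized into a single prompt and decoded from a single generated sequence. This is a minimal illustration under assumed conventions, not the paper's actual prompt, sub-task set, or output schema.

```python
import json

# Hypothetical sub-task set; the paper unifies heterogeneous QP sub-tasks
# into one generative format, but the exact tasks and schema here are assumed.
SUB_TASKS = ["ner", "term_weighting", "intent_description"]

def build_unified_prompt(query: str) -> str:
    """Serialize a search query into one instruction asking the model to
    emit every sub-task's output as a single structured sequence."""
    instruction = (
        "You are a query-understanding model for SNS search. "
        "For the query below, output a JSON object with the keys "
        f"{SUB_TASKS}: entity spans with types, a weight in [0, 1] per term, "
        "and a short natural-language intent description."
    )
    return f"{instruction}\nQuery: {query}\nOutput:"

def parse_unified_output(generated: str) -> dict:
    """Decode the model's single generated sequence back into per-task results."""
    return json.loads(generated)

if __name__ == "__main__":
    print(build_unified_prompt("winter hiking boots recommendations"))
```

Likewise, a minimal sketch of how "multi-reward Reinforcement Learning" might scalarize per-task feedback into one training signal for a policy update; the reward names, values, and mixing weights are illustrative assumptions, not the paper's configuration.

```python
from dataclasses import dataclass

@dataclass
class TaskReward:
    name: str
    value: float   # per-task score in [0, 1], e.g., span F1 or weight correlation
    weight: float  # illustrative mixing coefficient (assumed, not from the paper)

def combine_rewards(rewards: list[TaskReward]) -> float:
    """Scalarize per-task rewards into one RL signal via a normalized weighted sum."""
    total_weight = sum(r.weight for r in rewards)
    return sum(r.value * r.weight for r in rewards) / total_weight

# Example: one rollout scored on three hypothetical sub-task rewards.
reward = combine_rewards([
    TaskReward("ner_f1", 0.82, 1.0),
    TaskReward("term_weighting_corr", 0.74, 1.0),
    TaskReward("intent_description_quality", 0.90, 0.5),
])
print(f"scalar reward for policy update: {reward:.3f}")
```

A weighted sum is only one possible scalarization; the key design point it illustrates is that a single policy can be optimized against several sub-task objectives at once rather than iterating each model independently.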