Query as Anchor: Scenario-Adaptive User Representation via Large Language Model
February 16, 2026
Authors: Jiahao Yuan, Yike Xu, Jinyong Wen, Baokun Wang, Ziyi Gao, Xiaotong Lin, Yun Liu, Xing Fu, Yu Cheng, Yongchao Liu, Weiqiang Wang, Zhongle Xie
cs.AI
Abstract
Industrial-scale user representation learning must balance robust universality with acute task sensitivity. However, existing paradigms primarily yield static, task-agnostic embeddings that struggle to reconcile the divergent requirements of downstream scenarios within a unified vector space. Furthermore, heterogeneous multi-source data introduces inherent noise and modality conflicts that further degrade representation quality. We propose Query-as-Anchor, a framework that shifts user modeling from static encoding to dynamic, query-aware synthesis. To equip Large Language Models (LLMs) with deep user understanding, we first construct UserU, an industrial-scale pre-training dataset that aligns multi-modal behavioral sequences with user-understanding semantics; our Q-Anchor Embedding architecture then integrates hierarchical coarse-to-fine encoders into dual-tower LLMs via joint contrastive-autoregressive optimization for query-aware user representation. To bridge the gap between general pre-training and specialized business logic, we further introduce Cluster-based Soft Prompt Tuning, which enforces discriminative latent structures and effectively aligns model attention with scenario-specific modalities. For deployment, anchoring queries at the end of the sequence enables KV-cache-accelerated inference with negligible incremental latency. Evaluations on 10 Alipay industrial benchmarks show consistent state-of-the-art (SOTA) performance, strong scalability, and efficient deployment. Large-scale online A/B tests across two real-world scenarios in Alipay's production system further validate its practical effectiveness. Our code is prepared for public release and will be available at: https://github.com/JhCircle/Q-Anchor.
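The abstract names a joint contrastive-autoregressive objective without spelling it out. Below is a minimal PyTorch sketch of one plausible form, assuming an in-batch InfoNCE term over paired user/query embeddings combined with a standard causal language-modeling loss; the function and tensor names (`joint_loss`, `user_emb`, `alpha`, etc.) are illustrative, not taken from the paper.

```python
import torch
import torch.nn.functional as F

def joint_loss(user_emb, query_emb, lm_logits, lm_labels,
               temperature=0.05, alpha=0.5):
    """Hypothetical joint contrastive-autoregressive objective.

    user_emb:  (B, D) query-anchored user representations
    query_emb: (B, D) paired target/query representations
    lm_logits: (B, T, V) next-token logits from the LLM head
    lm_labels: (B, T) token ids (assumed pre-shifted), -100 where masked
    """
    # In-batch InfoNCE: each user matches its own query; the other
    # rows of the batch serve as negatives.
    u = F.normalize(user_emb, dim=-1)
    q = F.normalize(query_emb, dim=-1)
    sims = u @ q.T / temperature                      # (B, B)
    targets = torch.arange(u.size(0), device=u.device)
    contrastive = F.cross_entropy(sims, targets)

    # Standard causal LM loss over the behavior/understanding text.
    autoregressive = F.cross_entropy(
        lm_logits.reshape(-1, lm_logits.size(-1)),
        lm_labels.reshape(-1),
        ignore_index=-100,
    )
    return alpha * contrastive + (1 - alpha) * autoregressive
```

The weighting `alpha` trades off embedding discriminability against retention of the LLM's generative user understanding; the paper does not disclose its actual balance.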
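Cluster-based Soft Prompt Tuning is likewise only named in the abstract. A common realization, sketched here under the assumption that each scenario cluster owns a bank of learnable prompt vectors prepended to a frozen LLM's input embeddings, might look like this; `ClusterSoftPrompt` and its parameters are hypothetical.

```python
import torch
import torch.nn as nn

class ClusterSoftPrompt(nn.Module):
    """Hypothetical cluster-conditioned soft prompts: each scenario
    cluster owns a bank of learnable vectors prepended to the (frozen)
    LLM input embeddings, steering attention toward the modalities
    that matter for that cluster of business scenarios."""

    def __init__(self, n_clusters: int, prompt_len: int, hidden: int):
        super().__init__()
        self.prompts = nn.Parameter(
            torch.randn(n_clusters, prompt_len, hidden) * 0.02)

    def forward(self, input_embeds, cluster_ids):
        # input_embeds: (B, T, H); cluster_ids: (B,) long tensor
        p = self.prompts[cluster_ids]            # (B, P, H)
        # The attention mask must also be extended by P ones upstream.
        return torch.cat([p, input_embeds], dim=1)
```

Only the prompt bank is trained per deployment, which is what lets the method specialize a general pre-trained model to business-specific logic cheaply.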
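The deployment claim rests on a standard KV-cache property: because the anchoring query sits at the end of the sequence, the long behavioral prefix can be encoded once and its cached key/value states reused across scenario queries. The sketch below illustrates this with Hugging Face transformers and a small public checkpoint (`gpt2`) as a stand-in; the paper's actual model, prompt format, and pooling strategy are assumptions here.

```python
import copy
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2").eval()

history = "serialized multi-modal user behavior sequence ..."
queries = ["credit-risk scenario query", "marketing scenario query"]

with torch.no_grad():
    # Encode the long, query-independent behavioral prefix once.
    prefix = tok(history, return_tensors="pt")
    prefix_out = model(**prefix, use_cache=True)

    for q in queries:
        # Each scenario query extends a copy of the shared prefix
        # cache, so per-query compute covers only the short query
        # suffix, not the full history.
        cache = copy.deepcopy(prefix_out.past_key_values)
        q_ids = tok(" " + q, return_tensors="pt").input_ids
        mask = torch.ones(1, prefix.input_ids.size(1) + q_ids.size(1))
        out = model(input_ids=q_ids, attention_mask=mask,
                    past_key_values=cache, output_hidden_states=True)
        # Pool the last query token's hidden state as the query-aware
        # user embedding (the pooling choice is an assumption).
        user_vec = out.hidden_states[-1][:, -1, :]
```

Incremental latency therefore scales with the query length rather than the history length, which is why placing the query anchor last makes multi-scenario serving cheap.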