Do Reasoning Models Enhance Embedding Models?
January 29, 2026
Authors: Wun Yu Chan, Shaojin Chen, Huihao Jing, Kwun Hang Lau, Elton Chun-Chai Li, Zihao Wang, Haoran Li, Yangqiu Song
cs.AI
Abstract
State-of-the-art embedding models are increasingly derived from decoder-only Large Language Model (LLM) backbones adapted via contrastive learning. Given the emergence of reasoning models trained via Reinforcement Learning with Verifiable Rewards (RLVR), a natural question arises: does enhanced reasoning translate to superior semantic representations when these models serve as embedding initializations? Contrary to expectation, our evaluation on MTEB and BRIGHT reveals a **null effect**: embedding models initialized from RLVR-tuned backbones yield no consistent performance advantage over their base counterparts when subjected to identical training recipes. To unpack this paradox, we introduce **H**ierarchical **R**epresentation **S**imilarity **A**nalysis (HRSA), a framework that decomposes similarity across the representation, geometry, and function levels. HRSA reveals that while RLVR induces an irreversible reorganization of the latent manifold's local geometry and a reversible drift of the coordinate basis, it preserves the global manifold geometry and the linear readout. Consequently, subsequent contrastive learning drives strong alignment between base- and reasoning-initialized models, a phenomenon we term **Manifold Realignment**. Empirically, our findings suggest that, unlike Supervised Fine-Tuning (SFT), RLVR optimizes trajectories within an existing semantic landscape rather than fundamentally restructuring the landscape itself.
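The idea of a reversible coordinate basis drift that leaves the global geometry intact can be illustrated with a small synthetic sketch. This is not the paper's HRSA implementation: the data are random, the "drift" is assumed to be a pure orthogonal rotation, and the helper names (`linear_cka`, `mean_cosine`, `procrustes_align`) are hypothetical. Linear CKA and orthogonal Procrustes are used here only as familiar stand-ins for a rotation-invariant similarity measure and a basis realignment step.

```python
import numpy as np


def linear_cka(x, y):
    """Linear CKA between two (n_samples, dim) matrices; invariant to rotations."""
    x = x - x.mean(axis=0, keepdims=True)
    y = y - y.mean(axis=0, keepdims=True)
    cross = np.linalg.norm(x.T @ y, "fro") ** 2
    return cross / (np.linalg.norm(x.T @ x, "fro") * np.linalg.norm(y.T @ y, "fro"))


def mean_cosine(x, y):
    """Average cosine similarity between matching rows of x and y."""
    num = np.sum(x * y, axis=1)
    den = np.linalg.norm(x, axis=1) * np.linalg.norm(y, axis=1)
    return float(np.mean(num / den))


def procrustes_align(x, y):
    """Orthogonal matrix R minimizing ||x @ R - y||_F (orthogonal Procrustes)."""
    u, _, vt = np.linalg.svd(x.T @ y)
    return u @ vt


rng = np.random.default_rng(0)

# Synthetic "base" embeddings (assumption: 200 inputs, 16 dimensions).
base = rng.normal(size=(200, 16))

# Assumed basis drift: a random orthogonal rotation of the same points.
q, _ = np.linalg.qr(rng.normal(size=(16, 16)))
drifted = base @ q

# Coordinate-wise comparison is sensitive to the rotation ...
raw = mean_cosine(base, drifted)

# ... while a rotation-invariant measure sees identical geometry.
cka = linear_cka(base, drifted)

# The drift is reversible: an orthogonal map recovers the original coordinates.
r = procrustes_align(drifted, base)
aligned = mean_cosine(drifted @ r, base)

print(f"raw cosine: {raw:.3f}, linear CKA: {cka:.3f}, aligned cosine: {aligned:.3f}")
```

In this toy setting, the coordinate-wise cosine between the two sets is low while linear CKA and the post-alignment cosine are both essentially 1. That is the sense in which a change of basis alone need not change what a downstream linear readout, or subsequent contrastive training, can extract.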