Do Reasoning Models Enhance Embedding Models?
January 29, 2026
Authors: Wun Yu Chan, Shaojin Chen, Huihao Jing, Kwun Hang Lau, Elton Chun-Chai Li, Zihao Wang, Haoran Li, Yangqiu Song
cs.AI
Abstract
State-of-the-art embedding models are increasingly derived from decoder-only Large Language Model (LLM) backbones adapted via contrastive learning. Given the emergence of reasoning models trained via Reinforcement Learning with Verifiable Rewards (RLVR), a natural question arises: does enhanced reasoning translate to superior semantic representations when these models serve as embedding initializations? Contrary to expectation, our evaluation on MTEB and BRIGHT reveals a **null effect**: embedding models initialized from RLVR-tuned backbones yield no consistent performance advantage over their base counterparts when subjected to identical training recipes. To unpack this paradox, we introduce **H**ierarchical **R**epresentation **S**imilarity **A**nalysis (HRSA), a framework that decomposes similarity across representation, geometry, and function levels. HRSA reveals that while RLVR induces an irreversible reorganization of the latent manifold's local geometry and a reversible drift of the coordinate basis, it preserves the global manifold geometry and linear readout. Consequently, subsequent contrastive learning drives strong alignment between base- and reasoning-initialized models, a phenomenon we term **Manifold Realignment**. Empirically, our findings suggest that unlike Supervised Fine-Tuning (SFT), RLVR optimizes trajectories within an existing semantic landscape rather than fundamentally restructuring the landscape itself.
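The distinction between a reversible coordinate basis drift and a preserved global geometry can be illustrated with a small synthetic sketch. The abstract does not specify which metrics HRSA uses, so the choices below are assumptions made purely for illustration: linear CKA stands in for a basis-invariant geometry measure, row-wise cosine similarity stands in for a basis-sensitive representation measure, and orthogonal Procrustes stands in for the step that undoes the drift. The embeddings are random data, not outputs of any model from the paper, and all function names are hypothetical.

```python
import numpy as np


def linear_cka(x, y):
    """Linear CKA between two (n_samples, dim) matrices (assumed stand-in metric)."""
    x = x - x.mean(axis=0, keepdims=True)
    y = y - y.mean(axis=0, keepdims=True)
    cross = np.linalg.norm(x.T @ y, "fro") ** 2
    return cross / (np.linalg.norm(x.T @ x, "fro") * np.linalg.norm(y.T @ y, "fro"))


def mean_cosine(x, y):
    """Mean row-wise cosine similarity; depends on the coordinate basis."""
    num = np.sum(x * y, axis=1)
    den = np.linalg.norm(x, axis=1) * np.linalg.norm(y, axis=1)
    return float(np.mean(num / den))


def procrustes_align(x, y):
    """Orthogonal matrix R minimizing ||x @ R - y||_F (orthogonal Procrustes)."""
    u, _, vt = np.linalg.svd(x.T @ y)
    return u @ vt


# Synthetic "base" embeddings and a copy whose basis has been rotated.
rng = np.random.default_rng(0)
base = rng.normal(size=(200, 16))
rotation, _ = np.linalg.qr(rng.normal(size=(16, 16)))  # random orthogonal matrix
drifted = base @ rotation

cka = linear_cka(base, drifted)            # invariant to the rotation
raw_cos = mean_cosine(base, drifted)       # degraded by the rotation
aligned_cos = mean_cosine(base @ procrustes_align(base, drifted), drifted)

print(f"linear CKA: {cka:.4f}")
print(f"cosine before alignment: {raw_cos:.4f}")
print(f"cosine after alignment: {aligned_cos:.4f}")
```

In this toy setting the rotation leaves linear CKA at 1 while the raw coordinate-wise comparison drops, and an orthogonal alignment recovers it. This is only an analogy for the "reversible drift" described above; it does not model the irreversible local changes the abstract also reports.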