GAPrune: Gradient-Alignment Pruning for Domain-Aware Embeddings
September 13, 2025
Authors: Yixuan Tang, Yi Yang
cs.AI
Abstract
Domain-specific embedding models have shown promise for applications that
require specialized semantic understanding, such as coding agents and financial
retrieval systems, often outperforming general-purpose models. However,
state-of-the-art embedding models are typically based on LLMs,
which contain billions of parameters, making deployment challenging in
resource-constrained environments. Model compression through pruning offers a
promising solution, but existing pruning methods treat all parameters
uniformly, failing to distinguish between general semantic representations and
domain-specific patterns, leading to suboptimal pruning decisions. To address
this challenge, we propose GAPrune, a pruning framework that accounts for
domain importance while preserving the model's general linguistic foundation.
Our method uses Fisher Information to measure importance and
general-domain gradient alignment to assess parameter behavior, then combines
these signals using our Domain Alignment Importance (DAI) scoring. Lower DAI
scores indicate that a parameter either is less important for the domain task
or creates conflicts between domain and general objectives. Experiments on two
domain benchmarks, FinMTEB and ChemTEB, show that GAPrune maintains performance
within 2.5% of dense models in one-shot pruning at 50% sparsity, while
outperforming all baselines. With 100 steps of retraining, GAPrune achieves a
+4.51% improvement on FinMTEB and a +1.73% improvement on ChemTEB, demonstrating
that our pruning strategy not only preserves but also enhances domain-specific
capabilities.
Our findings show that principled pruning strategies can achieve both model
compression and enhanced domain specialization, offering the research community
a new pathway for developing compact, domain-specialized embedding models.
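
To make the scoring idea concrete, here is a minimal PyTorch-style sketch of how a DAI-like criterion could combine a diagonal Fisher Information proxy (squared domain gradients) with a domain-general gradient-alignment signal, followed by a one-shot 50% mask. The function names (`dai_scores`, `one_shot_prune`), the sign-based alignment term, the `alpha` weight, and the multiplicative combination are illustrative assumptions, not the paper's exact formulation.

```python
# Illustrative sketch only: the combination rule, the `alpha` weight, and the
# global threshold are assumptions, not the paper's exact DAI formulation.
import torch


def dai_scores(domain_grads, general_grads, alpha=1.0):
    """Per-parameter scores from Fisher-style importance and gradient alignment.

    domain_grads / general_grads: lists of gradient tensors accumulated on
    domain and general-domain batches (same shapes as the parameters).
    Lower scores flag parameters that matter little for the domain task or
    that conflict with the general objective.
    """
    scores = []
    for g_dom, g_gen in zip(domain_grads, general_grads):
        fisher = g_dom.pow(2)                     # diagonal Fisher Information proxy
        alignment = torch.sign(g_dom * g_gen)     # +1 aligned, -1 conflicting, 0 neutral
        scores.append(fisher * (1.0 + alpha * alignment))
    return scores


def one_shot_prune(params, scores, sparsity=0.5):
    """Zero out the `sparsity` fraction of parameters with the lowest scores."""
    flat = torch.cat([s.flatten() for s in scores])
    k = max(1, int(sparsity * flat.numel()))
    threshold = torch.kthvalue(flat, k).values    # global score cutoff
    for p, s in zip(params, scores):
        p.data.mul_((s > threshold).to(p.dtype))  # apply binary mask in place


# Toy usage with random tensors standing in for model parameters and gradients.
params = [torch.randn(4, 4), torch.randn(8)]
dom_g = [torch.randn_like(p) for p in params]
gen_g = [torch.randn_like(p) for p in params]
one_shot_prune(params, dai_scores(dom_g, gen_g), sparsity=0.5)
print([float((p == 0).float().mean()) for p in params])  # per-tensor zero fraction
```

In an actual pipeline, the gradients would presumably be accumulated from the embedding training objective on domain batches (e.g., FinMTEB or ChemTEB data) and on general-domain batches before scoring, with the resulting mask held fixed during any subsequent retraining.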