KV-CoRE: Benchmarking Data-Dependent Low-Rank Compressibility of KV-Caches in LLMs

February 5, 2026
Authors: Jian Chen, Zhuoran Wang, Jiayu Qin, Ming Li, Meng Wang, Changyou Chen, Yin Chen, Qizhen Weng, Yirui Liu
cs.AI

Abstract

Large language models rely on KV-caches to avoid redundant computation during autoregressive decoding, but as context length grows, reading and writing the cache can quickly saturate GPU memory bandwidth. Recent work has explored KV-cache compression, yet most approaches neglect the data-dependent nature of KV-caches and their variation across layers. We introduce KV-CoRE (KV-cache Compressibility by Rank Evaluation), an SVD-based method for quantifying the data-dependent low-rank compressibility of KV-caches. KV-CoRE computes the optimal low-rank approximation under the Frobenius norm and, being gradient-free and incremental, enables efficient dataset-level, layer-wise evaluation. Using this method, we analyze multiple models and datasets spanning five English domains and sixteen languages, uncovering systematic patterns that link compressibility to model architecture, training data, and language coverage. As part of this analysis, we employ the Normalized Effective Rank as a metric of compressibility and show that it correlates strongly with performance degradation under compression. Our study establishes a principled evaluation framework and the first large-scale benchmark of KV-cache compressibility in LLMs, offering insights for dynamic, data-aware compression and data-centric model development.
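To make the quantities named in the abstract concrete, here is a minimal NumPy sketch. It is not the authors' implementation, and the abstract does not give their exact definitions. It assumes that (a) the effective rank is the entropy-based one (the exponential of the Shannon entropy of the normalized singular values), divided by the number of singular values to normalize it, and (b) "incremental" processing can be approximated by accumulating the Gram matrix X^T X over chunks of cached key or value vectors. The names `GramAccumulator`, `normalized_effective_rank`, and `truncation_error` are hypothetical.

```python
import numpy as np


def truncation_error(X, r):
    """Relative Frobenius error of the best rank-r approximation of X.

    By the Eckart-Young theorem, the best rank-r approximation keeps the
    top r singular values, so the error depends only on the discarded ones.
    """
    s = np.linalg.svd(X, compute_uv=False)
    return float(np.sqrt(np.sum(s[r:] ** 2) / np.sum(s ** 2)))


def normalized_effective_rank(s):
    """Entropy-based effective rank divided by len(s) (assumed definition)."""
    s = np.asarray(s, dtype=float)
    total = s.sum()
    if total == 0:
        return 0.0
    p = s / total
    p = p[p > 0]  # treat 0 * log(0) as 0
    entropy = -np.sum(p * np.log(p))
    return float(np.exp(entropy) / len(s))


class GramAccumulator:
    """Accumulates X^T X over chunks so the full cache is never held at once.

    This is one possible gradient-free, incremental scheme (an assumption,
    not necessarily the paper's): the eigenvalues of the summed Gram matrix
    are the squared singular values of the stacked data matrix.
    """

    def __init__(self, dim):
        self.gram = np.zeros((dim, dim))

    def update(self, chunk):
        # chunk: (num_tokens, dim) slice of one layer's keys or values
        self.gram += chunk.T @ chunk

    def singular_values(self):
        eig = np.linalg.eigvalsh(self.gram)  # ascending order
        return np.sqrt(np.clip(eig, 0.0, None))[::-1]  # descending order


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic stand-in for one layer's cached keys: 200 tokens, dim 64, rank 4.
    keys = rng.standard_normal((200, 4)) @ rng.standard_normal((4, 64))
    acc = GramAccumulator(dim=64)
    for chunk in np.array_split(keys, 5):
        acc.update(chunk)
    print("NER:", normalized_effective_rank(acc.singular_values()))
    print("rank-4 relative error:", truncation_error(keys, 4))
```

Under these assumptions, a value near 1 means the singular-value spectrum is flat and the cache is hard to compress, while a value near 0 means a few directions dominate and a low-rank projection loses little.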