Density-aware Soft Context Compression with Semi-Dynamic Compression Ratio

Abstract

Soft context compression reduces the computational cost of processing long contexts in LLMs by encoding a long context into a smaller number of latent tokens. Existing frameworks, however, apply a uniform compression ratio and so fail to account for the extreme variance in the information density of natural language. Although a density-aware, dynamic compression ratio seems intuitive, empirical investigation shows that models struggle intrinsically with operations parameterized by input-dependent, continuous structural hyperparameters. To resolve this pitfall, we introduce the Semi-Dynamic Context Compression framework. Our approach features a Discrete Ratio Selector, which predicts a compression target from the intrinsic information density of the input and quantizes it to a predefined set of discrete compression ratios. The selector is trained jointly and efficiently with the compressor on synthetic data, using summary lengths as a proxy to create labels for compression ratio prediction. Extensive evaluations confirm that our density-aware framework, which uses mean pooling as its backbone, consistently outperforms static baselines and establishes a robust Pareto frontier for context compression techniques. Our code, data, and model weights are available at https://github.com/yuyijiong/semi-dynamic-context-compress.
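The two mechanical steps named in the abstract, quantizing a predicted compression target to a discrete ratio and compressing by mean pooling, can be illustrated with a minimal sketch. This is not the authors' implementation: the candidate set `CANDIDATE_RATIOS`, the nearest-value quantization rule, and the function names `quantize_ratio`, `mean_pool`, and `compress` are all assumptions made for illustration. The learned selector is also omitted, so the predicted ratio is simply passed in as a number.

```python
from typing import List, Sequence, Tuple

# Hypothetical candidate set; the abstract does not list the actual ratios.
CANDIDATE_RATIOS: Tuple[int, ...] = (2, 4, 8, 16)


def quantize_ratio(predicted: float,
                   candidates: Sequence[int] = CANDIDATE_RATIOS) -> int:
    """Snap a continuous predicted ratio to the nearest allowed discrete ratio.

    Nearest-value rounding is an assumption; the real selector may differ.
    """
    return min(candidates, key=lambda r: abs(r - predicted))


def mean_pool(embeddings: List[List[float]], ratio: int) -> List[List[float]]:
    """Average each consecutive group of `ratio` token vectors into one vector.

    A trailing group shorter than `ratio` is averaged over its own length.
    """
    pooled = []
    for start in range(0, len(embeddings), ratio):
        group = embeddings[start:start + ratio]
        dim = len(group[0])
        pooled.append(
            [sum(vec[d] for vec in group) / len(group) for d in range(dim)]
        )
    return pooled


def compress(embeddings: List[List[float]],
             predicted_ratio: float) -> Tuple[int, List[List[float]]]:
    """Quantize the predicted ratio, then mean-pool at that ratio."""
    ratio = quantize_ratio(predicted_ratio)
    return ratio, mean_pool(embeddings, ratio)


if __name__ == "__main__":
    tokens = [[float(i), float(i) + 1.0] for i in range(8)]  # 8 toy token vectors
    ratio, latents = compress(tokens, predicted_ratio=3.7)
    print(ratio, len(latents))
```

Because the ratio is restricted to a small fixed set, the compressor only ever sees a handful of pooling window sizes. This is one way to read the abstract's distinction between a semi-dynamic ratio and a fully continuous one.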
