KV Cache Steering for Inducing Reasoning in Small Language Models

Abstract

We propose cache steering, a lightweight method for implicit steering of language models via a one-shot intervention applied directly to the key-value cache. To validate its effectiveness, we apply cache steering to induce chain-of-thought reasoning in small language models. Our approach leverages GPT-4o-generated reasoning traces to construct steering vectors that shift model behavior toward more explicit, multi-step reasoning without fine-tuning or prompt modifications. Experimental evaluations on diverse reasoning benchmarks demonstrate that cache steering improves both the qualitative structure of model reasoning and quantitative task performance. Compared to prior activation steering techniques that require continuous interventions, our one-shot cache steering offers substantial advantages in terms of hyperparameter stability, inference-time efficiency, and ease of integration, making it a more robust and practical solution for controlled generation.
