A Single Layer to Explain Them All: Understanding Massive Activations in Large Language Models

Abstract

We investigate the origins of massive activations in large language models (LLMs) and identify a specific layer, the Massive Emergence Layer (ME Layer), that is consistently observed across model families. Massive activations first emerge in this layer and subsequently propagate to deeper layers through residual connections. We show that, within the ME Layer, the RMSNorm and FFN parameters jointly contribute to the emergence of massive activations. Once formed, the massive activation token representation remains largely invariant across layers, reducing the diversity of the hidden representations passed to the attention module. Motivated by this limitation, we propose a simple and effective method to reduce the rigidity of the massive activation token. Our approach consistently improves LLM performance across multiple tasks, including instruction following and math reasoning, in both training-free and fine-tuning settings. Moreover, we show that our method mitigates attention sinks by selectively weakening their influence, elucidating their origin at the hidden-state level and shedding new light on principled mitigation strategies.
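
To make the notion of a layer where massive activations "first emerge" and then propagate through residual connections more concrete, here is a minimal Python sketch on synthetic data. It is not the paper's method. The function names, the two thresholds (an absolute magnitude and a ratio to the median magnitude), and the toy residual stream are all assumptions chosen for illustration.

```python
import numpy as np


def find_emergence_layer(hidden_states, abs_threshold=100.0, ratio_threshold=1000.0):
    """Return the index of the first layer containing a massive activation.

    hidden_states: sequence of arrays shaped (num_tokens, hidden_dim), one per layer.
    Returns None if no layer qualifies.
    """
    for layer_idx, h in enumerate(hidden_states):
        magnitudes = np.abs(np.asarray(h, dtype=np.float64))
        peak = magnitudes.max()
        median = np.median(magnitudes)
        # Both thresholds are illustrative assumptions, not values from the paper.
        if peak > abs_threshold and peak > ratio_threshold * median:
            return layer_idx
    return None


def simulate_residual_stream(num_layers=6, num_tokens=8, hidden_dim=16,
                             spike_layer=2, seed=0):
    """Toy residual stream: every layer adds a small update, one layer adds a spike."""
    rng = np.random.default_rng(seed)
    h = rng.normal(scale=1.0, size=(num_tokens, hidden_dim))
    states = []
    for layer_idx in range(num_layers):
        update = rng.normal(scale=0.1, size=h.shape)
        if layer_idx == spike_layer:
            # Hypothetical spike on token 0, channel 3.
            update[0, 3] += 5000.0
        # The residual connection carries the spike forward to later layers.
        h = h + update
        states.append(h.copy())
    return states


states = simulate_residual_stream()
me_layer = find_emergence_layer(states)
print("first layer with a massive activation:", me_layer)
print("magnitude at the last layer:", abs(states[-1][0, 3]))
```

In this toy setup the detected layer is the one where the spike was injected, and the large value is still present in the last layer because each layer only adds a small update on top of the residual stream. With a real model, the per-layer hidden states would come from the model's forward pass instead of `simulate_residual_stream`.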
