Uncovering mesa-optimization algorithms in Transformers

Abstract

Transformers have become the dominant model in deep learning, but the reason for their superior performance is poorly understood. Here, we hypothesize that the strong performance of Transformers stems from an architectural bias towards mesa-optimization, a learned process running within the forward pass of a model and consisting of two steps: (i) the construction of an internal learning objective, and (ii) its corresponding solution found through optimization. To test this hypothesis, we reverse-engineer a series of autoregressive Transformers trained on simple sequence modeling tasks, uncovering underlying gradient-based mesa-optimization algorithms driving the generation of predictions. Moreover, we show that the learned forward-pass optimization algorithm can be immediately repurposed to solve supervised few-shot tasks, suggesting that mesa-optimization might underlie the in-context learning capabilities of large language models. Finally, we propose a novel self-attention layer, the mesa-layer, that explicitly and efficiently solves optimization problems specified in context. We find that this layer can lead to improved performance in synthetic and preliminary language modeling experiments, adding weight to our hypothesis that mesa-optimization is an important operation hidden within the weights of trained Transformers.
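
As a rough illustration of the two steps named in the abstract, the sketch below builds an in-context objective from (key, value) pairs and then solves it, once in closed form and once with a single gradient step. The objective (ridge regression on a linear map `W`), the function names, and the parameters `lam` and `eta` are all assumptions made for this example; the abstract does not specify the form of the internal objective, and this should not be read as the authors' mesa-layer implementation.

```python
import numpy as np


def in_context_ridge_predict(keys, values, query, lam=1e-3):
    """Hypothetical helper: fit W on the context, then apply it to the query.

    Step (i): the internal objective is assumed to be
        sum_t ||v_t - W k_t||^2 + lam * ||W||_F^2
    Step (ii): its minimizer is W = V^T K (K^T K + lam I)^{-1}.
    """
    d_k = keys.shape[1]
    gram = keys.T @ keys + lam * np.eye(d_k)
    # solve() returns (K^T K + lam I)^{-1} K^T V with shape (d_k, d_v).
    w = np.linalg.solve(gram, keys.T @ values).T
    return w @ query


def one_gradient_step_predict(keys, values, query, eta=0.1):
    """Hypothetical helper: one gradient step on 0.5 * sum_t ||v_t - W k_t||^2.

    Starting from W = 0, the update is W = eta * V^T K, so the prediction
    equals eta * sum_t v_t (k_t . query), an unnormalized linear attention.
    """
    w = eta * values.T @ keys
    return w @ query


# Noiseless linear data: the closed-form solver should recover the true map.
rng = np.random.default_rng(0)
w_true = rng.normal(size=(2, 3))
keys = rng.normal(size=(20, 3))
values = keys @ w_true.T
query = rng.normal(size=3)

ridge_pred = in_context_ridge_predict(keys, values, query, lam=1e-8)
gd_pred = one_gradient_step_predict(keys, values, query, eta=0.1)
linear_attention = 0.1 * sum(v * (k @ query) for k, v in zip(keys, values))
```

Under these assumptions, the single gradient step coincides with a linear self-attention readout over the context, while the closed-form solve corresponds to optimizing the same kind of objective to convergence.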
