What do Language Models Learn and When? The Implicit Curriculum Hypothesis

Abstract

Large language models (LLMs) can perform remarkably complex tasks, yet the fine-grained details of how these capabilities emerge during pretraining remain poorly understood. Scaling laws on validation loss tell us how much a model improves with additional compute, but not which skills it acquires or in what order. To address this, we propose the Implicit Curriculum Hypothesis: pretraining follows a compositional and predictable curriculum across models and data mixtures. We test this by designing a suite of simple, composable tasks spanning retrieval, morphological transformations, coreference, logical reasoning, and mathematics. Using these tasks, we track emergence points across four model families ranging from 410M to 13B parameters. We find that emergence orderings, defined by when models reach fixed accuracy thresholds, are strikingly consistent (ρ = 0.81 across 45 model pairs), and that composite tasks most often emerge after their component tasks. Furthermore, we find that this structure is encoded in model representations: tasks with similar function vector representations also tend to follow similar trajectories in training. By using the space of representations derived from our task set, we can effectively predict the training trajectories of simple held-out compositional tasks throughout the course of pretraining (R^2 = 0.68 to 0.84 across models) without previously evaluating them. Together, these results suggest that pretraining is more structured than loss curves reveal: skills emerge in a compositional order that is consistent across models and readable from their internals.
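To make the ordering-consistency measurement concrete, the following is a minimal sketch of how an emergence point and a pairwise rank correlation could be computed. It is not the authors' code. It assumes that ρ denotes a Spearman rank correlation (the abstract does not say), that an emergence point is the first checkpoint at which accuracy reaches a threshold, and that the threshold is 0.5. The model names, task names, and accuracy curves are all hypothetical toy values.

```python
from itertools import combinations
from statistics import mean


def emergence_step(steps, accuracies, threshold=0.5):
    """Return the first checkpoint step whose accuracy reaches the threshold.

    The 0.5 default is an assumption; returns None if the task never emerges.
    """
    for step, acc in zip(steps, accuracies):
        if acc >= threshold:
            return step
    return None


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman correlation, computed as the Pearson correlation of the ranks.

    Undefined (division by zero) if either input is constant.
    """
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical checkpoints and per-task accuracy curves for three toy models.
steps = [1000, 2000, 4000, 8000]
curves = {
    "model_a": {"copy": [0.2, 0.6, 0.9, 0.9],
                "plural": [0.1, 0.3, 0.7, 0.9],
                "copy+plural": [0.0, 0.1, 0.4, 0.8]},
    "model_b": {"copy": [0.5, 0.8, 0.9, 0.9],
                "plural": [0.2, 0.5, 0.8, 0.9],
                "copy+plural": [0.0, 0.2, 0.3, 0.6]},
    "model_c": {"copy": [0.1, 0.3, 0.6, 0.9],
                "plural": [0.2, 0.7, 0.9, 0.9],
                "copy+plural": [0.0, 0.1, 0.3, 0.7]},
}

# Emergence step of every task in every model.
emergence = {
    model: {task: emergence_step(steps, accs) for task, accs in tasks.items()}
    for model, tasks in curves.items()
}

# Correlate emergence orderings for each model pair over tasks that emerged in both.
pair_rhos = {}
for m1, m2 in combinations(sorted(emergence), 2):
    shared = [t for t in emergence[m1]
              if emergence[m1][t] is not None and emergence[m2][t] is not None]
    pair_rhos[(m1, m2)] = spearman_rho(
        [emergence[m1][t] for t in shared],
        [emergence[m2][t] for t in shared],
    )

mean_rho = mean(pair_rhos.values())
print(pair_rhos, mean_rho)
```

In this toy setup the composite task "copy+plural" crosses the threshold after both of its components in every model, which is the pattern the hypothesis predicts. Averaging the pairwise values is only one plausible way to summarize them; the abstract does not specify how the reported figure was aggregated.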
