Pre-trained Large Language Models Learn Hidden Markov Models In-context

Abstract

Hidden Markov Models (HMMs) are foundational tools for modeling sequential data with latent Markovian structure, yet fitting them to real-world data remains computationally challenging. In this work, we show that pre-trained large language models (LLMs) can effectively model data generated by HMMs via in-context learning (ICL), their ability to infer patterns from examples within a prompt. On a diverse set of synthetic HMMs, LLMs achieve predictive accuracy approaching the theoretical optimum. We uncover novel scaling trends influenced by HMM properties, and offer theoretical conjectures for these empirical observations. We also provide practical guidelines for scientists on using ICL as a diagnostic tool for complex data. On real-world animal decision-making tasks, ICL achieves performance competitive with models designed by human experts. To our knowledge, this is the first demonstration that ICL can learn and predict HMM-generated sequences, an advance that deepens our understanding of in-context learning in LLMs and establishes its potential as a powerful tool for uncovering hidden structure in complex scientific data.
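As a rough illustration of the synthetic setting, the sketch below samples a sequence from a small HMM and measures the next-observation accuracy of a predictor that knows the true parameters and runs the forward (filtering) recursion. This is one natural reading of the "theoretical optimum" that an in-context predictor would be compared against; the abstract does not specify the exact baseline. The function names, the two-state parameters, and the sequence length are all hypothetical choices for illustration, not the paper's experimental setup.

```python
import random


def sample_hmm(A, B, pi, T, rng):
    """Sample T observations from an HMM.

    A[i][j]: transition probability from state i to state j
    B[i][k]: probability that state i emits symbol k
    pi[i]:   initial state probability
    """
    states = list(range(len(pi)))
    symbols = list(range(len(B[0])))
    z = rng.choices(states, weights=pi)[0]
    obs = []
    for _ in range(T):
        obs.append(rng.choices(symbols, weights=B[z])[0])
        z = rng.choices(states, weights=A[z])[0]
    return obs


def forward_predict(A, B, pi, obs):
    """Return, for each t, the distribution over obs[t] given obs[:t],
    computed with the true parameters via the forward recursion."""
    n, m = len(pi), len(B[0])
    prior = list(pi)  # belief over the current state before seeing obs[t]
    preds = []
    for o in obs:
        # Predictive distribution over the symbol at this position.
        preds.append([sum(prior[i] * B[i][k] for i in range(n)) for k in range(m)])
        # Condition the belief on the symbol that was actually observed.
        post = [prior[i] * B[i][o] for i in range(n)]
        total = sum(post)
        post = [p / total for p in post]
        # Propagate the belief one step through the transition matrix.
        prior = [sum(post[i] * A[i][j] for i in range(n)) for j in range(n)]
    return preds


def oracle_accuracy(A, B, pi, obs):
    """Fraction of positions where the most likely predicted symbol matches."""
    preds = forward_predict(A, B, pi, obs)
    hits = sum(
        1 for p, o in zip(preds, obs)
        if max(range(len(p)), key=p.__getitem__) == o
    )
    return hits / len(obs)


if __name__ == "__main__":
    # Hypothetical two-state, two-symbol HMM (illustrative values only).
    A = [[0.9, 0.1], [0.2, 0.8]]
    B = [[0.8, 0.2], [0.1, 0.9]]
    pi = [0.5, 0.5]
    obs = sample_hmm(A, B, pi, T=5000, rng=random.Random(0))
    print("oracle next-symbol accuracy:", oracle_accuracy(A, B, pi, obs))
```

In an ICL experiment of the kind the abstract describes, the sampled sequence would be serialized into a prompt and the LLM's next-token predictions scored against the same sequence, so that its accuracy can be compared with the oracle value computed here.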
