Pre-trained Large Language Models Learn Hidden Markov Models In-context

Abstract

Hidden Markov Models (HMMs) are foundational tools for modeling sequential data with latent Markovian structure, yet fitting them to real-world data remains computationally challenging. In this work, we show that pre-trained large language models (LLMs) can effectively model data generated by HMMs via in-context learning (ICL), their ability to infer patterns from examples within a prompt. On a diverse set of synthetic HMMs, LLMs achieve predictive accuracy approaching the theoretical optimum. We uncover novel scaling trends influenced by HMM properties, and offer theoretical conjectures for these empirical observations. We also provide practical guidelines for scientists on using ICL as a diagnostic tool for complex data. On real-world animal decision-making tasks, ICL performs competitively with models designed by human experts. To our knowledge, this is the first demonstration that ICL can learn and predict HMM-generated sequences, an advance that deepens our understanding of in-context learning in LLMs and establishes its potential as a powerful tool for uncovering hidden structure in complex scientific data.
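
To make the phrase "theoretical optimum" concrete, the following is a minimal sketch, not the authors' code. It assumes a small discrete HMM with hypothetical parameters `pi`, `A`, and `B` (none of which come from the paper), samples an observation sequence from it, and computes the next-observation distribution that a predictor with access to the true parameters would output, using the standard forward algorithm. The space-separated prompt format at the end is likewise only an assumption about how such a sequence might be presented to an LLM.

```python
import random


def sample_hmm(pi, A, B, length, rng):
    """Sample `length` observations from a discrete HMM.

    pi: initial state distribution, A: state transition matrix,
    B: emission matrix (rows = states, columns = symbols).
    """
    states = list(range(len(pi)))
    symbols = list(range(len(B[0])))
    z = rng.choices(states, weights=pi)[0]
    obs = []
    for _ in range(length):
        obs.append(rng.choices(symbols, weights=B[z])[0])
        z = rng.choices(states, weights=A[z])[0]
    return obs


def next_obs_distribution(pi, A, B, obs):
    """Forward algorithm: P(next observation | obs) under the true parameters.

    Assumes every observed symbol has nonzero probability under the model.
    """
    n = len(pi)
    belief = list(pi)  # P(current state | observations so far, excluding current)
    for o in obs:
        # Condition the state belief on the observed symbol.
        post = [belief[i] * B[i][o] for i in range(n)]
        total = sum(post)
        post = [p / total for p in post]
        # Propagate the belief one step through the transition matrix.
        belief = [sum(post[i] * A[i][j] for i in range(n)) for j in range(n)]
    # Marginalize over the hidden state to get the symbol distribution.
    return [sum(belief[i] * B[i][k] for i in range(n)) for k in range(len(B[0]))]


# Hypothetical 2-state, 3-symbol HMM chosen purely for illustration.
pi = [0.5, 0.5]
A = [[0.9, 0.1],
     [0.2, 0.8]]
B = [[0.8, 0.1, 0.1],
     [0.1, 0.1, 0.8]]

rng = random.Random(0)
obs = sample_hmm(pi, A, B, length=50, rng=rng)

# One possible (assumed) way to serialize the sequence into a prompt.
prompt = " ".join(str(o) for o in obs)

# Reference prediction that an in-context learner can at best match.
optimal = next_obs_distribution(pi, A, B, obs)
```

Comparing an LLM's predicted next-symbol distribution for `prompt` against `optimal` is one way to read the claim that ICL accuracy approaches the theoretical optimum; the exact evaluation protocol used in the paper is not specified in this abstract.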
