In-Context Learning Strategies Emerge Rationally
June 21, 2025
Authors: Daniel Wurgaft, Ekdeep Singh Lubana, Core Francisco Park, Hidenori Tanaka, Gautam Reddy, Noah D. Goodman
cs.AI
Abstract
Recent work analyzing in-context learning (ICL) has identified a broad set of
strategies that describe model behavior in different experimental conditions.
We aim to unify these findings by asking why a model learns these disparate
strategies in the first place. Specifically, we start with the observation that
when trained to learn a mixture of tasks, as is popular in the literature, the
strategies learned by a model for performing ICL can be captured by a family of
Bayesian predictors: a memorizing predictor, which assumes a discrete prior on
the set of seen tasks, and a generalizing predictor, where the prior matches
the underlying task distribution. Adopting the normative lens of rational
analysis, where a learner's behavior is explained as an optimal adaptation to
data given computational constraints, we develop a hierarchical Bayesian
framework that almost perfectly predicts Transformer next-token predictions
throughout training -- without assuming access to its weights. Under this
framework, pretraining is viewed as a process of updating the posterior
probability of different strategies, and inference-time behavior as a
posterior-weighted average over these strategies' predictions. Our framework
draws on common assumptions about neural network learning dynamics, which make
explicit a tradeoff between loss and complexity among candidate strategies:
beyond how well it explains the data, a model's preference towards implementing
a strategy is dictated by its complexity. This helps explain well-known ICL
phenomena, while offering novel predictions: e.g., we show a superlinear trend
in the timescale for transitioning from generalization to memorization as task
diversity increases. Overall, our work advances an explanatory and predictive
account of ICL grounded in tradeoffs between strategy loss and complexity.
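To make the framework described above concrete, here is a minimal numerical sketch in Python. It is not the paper's implementation: the coin-flip task mixture, the particular seen_biases, and the exp(-complexity) prior with made-up complexity values are illustrative assumptions. The sketch shows a memorizing predictor (discrete prior over the seen tasks), a generalizing predictor (prior matching the assumed underlying task distribution), a posterior over the two strategies that trades off data fit against complexity, and inference-time behavior as a posterior-weighted average of the strategies' predictions.

```python
import numpy as np

# Illustrative setup (assumption): each "task" is a coin bias; pretraining
# sees a finite set of biases, while the underlying task distribution is
# Uniform(0, 1).
rng = np.random.default_rng(0)
seen_biases = np.array([0.2, 0.5, 0.8])   # hypothetical seen-task set

def memorizing_predictive(context):
    """P(next flip = 1 | context) under a discrete uniform prior over the
    seen task biases (the memorizing predictor)."""
    heads, n = context.sum(), len(context)
    lik = seen_biases**heads * (1 - seen_biases)**(n - heads)
    post = lik / lik.sum()
    return np.dot(post, seen_biases)

def generalizing_predictive(context):
    """P(next flip = 1 | context) under a Beta(1, 1) prior that matches the
    assumed Uniform(0, 1) task distribution (the generalizing predictor)."""
    heads, n = context.sum(), len(context)
    return (heads + 1) / (n + 2)           # Laplace's rule of succession

def strategy_posterior(contexts, complexity_mem=3.0, complexity_gen=1.0):
    """Posterior over {memorize, generalize} given pretraining-like data.
    Prior ∝ exp(-complexity); the complexity values are made-up stand-ins
    for a description-length-style penalty."""
    log_prior = -np.array([complexity_mem, complexity_gen])
    log_lik = np.zeros(2)
    for ctx in contexts:
        for t in range(len(ctx)):
            p_mem = memorizing_predictive(ctx[:t])
            p_gen = generalizing_predictive(ctx[:t])
            log_lik[0] += np.log(p_mem if ctx[t] == 1 else 1 - p_mem)
            log_lik[1] += np.log(p_gen if ctx[t] == 1 else 1 - p_gen)
    log_post = log_prior + log_lik
    log_post -= log_post.max()
    post = np.exp(log_post)
    return post / post.sum()

def blended_prediction(context, post):
    """Inference-time output: posterior-weighted average of the two
    strategies' next-token predictions."""
    return post[0] * memorizing_predictive(context) + \
           post[1] * generalizing_predictive(context)

# Simulate "pretraining" contexts drawn from the seen tasks, then query.
contexts = [rng.binomial(1, rng.choice(seen_biases), size=20) for _ in range(50)]
post = strategy_posterior(contexts)
query = np.array([1, 1, 0, 1])
print("P(strategy):", post, " blended P(next=1):", blended_prediction(query, post))
```

In this toy setting, the memorizing predictor starts with a larger complexity penalty but explains the seen-task data better, so as more pretraining contexts accumulate the likelihood term dominates and the posterior shifts toward memorization, a rough analogue of the generalization-to-memorization transition discussed in the abstract.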