What Has a Foundation Model Found? Using Inductive Bias to Probe for World Models
July 9, 2025
Authors: Keyon Vafa, Peter G. Chang, Ashesh Rambachan, Sendhil Mullainathan
cs.AI
Abstract
Foundation models are premised on the idea that sequence prediction can
uncover deeper domain understanding, much like how Kepler's predictions of
planetary motion later led to the discovery of Newtonian mechanics. However,
evaluating whether these models truly capture deeper structure remains a
challenge. We develop a technique for evaluating foundation models that
examines how they adapt to synthetic datasets generated from some postulated
world model. Our technique measures whether the foundation model's inductive
bias aligns with the world model, and so we refer to it as an inductive bias
probe. Across multiple domains, we find that foundation models can excel at
their training tasks yet fail to develop inductive biases towards the
underlying world model when adapted to new tasks. In particular, we find that
foundation models trained on orbital trajectories consistently fail to apply
Newtonian mechanics when adapted to new physics tasks. Further analysis reveals
that these models behave as if they develop task-specific heuristics that fail
to generalize.
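
The idea of probing inductive bias with synthetic datasets can be illustrated with a toy sketch. This is not the authors' implementation: the world model (`state`, here simply `x % 3`), the two feature maps, and the `probe` function are all hypothetical names chosen for illustration. Each synthetic task is a random function of the postulated state; a learner is fit on a few examples and scored on held-out inputs. A learner whose inductive bias matches the world model should extrapolate better than one that groups inputs in a way unrelated to state.

```python
import numpy as np

def state(x):
    # Hypothetical world model: the latent state of input x is x mod 3.
    return x % 3

def aligned_features(x):
    # Learner whose representation matches the postulated state.
    return np.eye(3)[state(x)]

def misaligned_features(x):
    # Learner that groups inputs by x // 4, which is unrelated to the state.
    return np.eye(3)[x // 4]

def probe(features, n_tasks=200, n_inputs=12, n_train=6, seed=0):
    """Mean held-out squared error over synthetic tasks drawn from the world model."""
    rng = np.random.default_rng(seed)
    xs = np.arange(n_inputs)
    errors = []
    for _ in range(n_tasks):
        g = rng.normal(size=3)            # random function of the state
        y = g[state(xs)]                  # synthetic labels depend only on state
        train = rng.choice(n_inputs, size=n_train, replace=False)
        test = np.setdiff1d(xs, train)
        # "Adapt" the learner to the small synthetic dataset via least squares.
        w, *_ = np.linalg.lstsq(features(xs[train]), y[train], rcond=None)
        pred = features(xs[test]) @ w
        errors.append(np.mean((pred - y[test]) ** 2))
    return float(np.mean(errors))

if __name__ == "__main__":
    print("aligned learner error:   ", probe(aligned_features))
    print("misaligned learner error:", probe(misaligned_features))
```

In this toy setting a lower held-out error indicates that the learner extrapolates along the postulated state. The probe described in the abstract applies the same kind of comparison to a foundation model adapted to each synthetic dataset, rather than to hand-built feature maps.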