What Has a Foundation Model Found? Using Inductive Bias to Probe for World Models

Abstract

Foundation models are premised on the idea that sequence prediction can uncover deeper domain understanding, much as Kepler's predictions of planetary motion later led to the discovery of Newtonian mechanics. However, evaluating whether these models truly capture deeper structure remains a challenge. We develop a technique for evaluating foundation models that examines how they adapt to synthetic datasets generated from some postulated world model. Our technique measures whether the foundation model's inductive bias aligns with the world model, and so we refer to it as an inductive bias probe. Across multiple domains, we find that foundation models can excel at their training tasks yet fail to develop inductive biases towards the underlying world model when adapted to new tasks. In particular, we find that foundation models trained on orbital trajectories consistently fail to apply Newtonian mechanics when adapted to new physics tasks. Further analysis reveals that these models behave as if they develop task-specific heuristics that fail to generalize.
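
To make the idea of an inductive bias probe concrete, the following toy sketch may help. It is not the procedure used in this work; it only illustrates the general recipe described above under simplifying assumptions. The world model (`world_state`, here "state is `x mod 4`"), the stand-in learner (`fit_1nn`), and the agreement score (`same_state_agreement`) are all hypothetical names and choices introduced for illustration. Each synthetic task assigns labels as a random function of the postulated state, a learner is fit on a few examples, and we then check whether its extrapolations treat inputs that share a state identically.

```python
import numpy as np

def world_state(x):
    # Postulated world model (an assumption for this toy): state of x is x mod 4.
    return x % 4

def fit_1nn(train_x, train_y, featurize):
    # Stand-in learner (hypothetical): 1-nearest-neighbour over a chosen feature map.
    feats = featurize(train_x).astype(float)

    def predict(x):
        # Distance from every query to every training point in feature space.
        d = np.abs(featurize(x).astype(float)[:, None] - feats[None, :])
        return train_y[np.argmin(d, axis=1)]

    return predict

def same_state_agreement(featurize, n_tasks=50, n_inputs=40, n_train=12, seed=0):
    rng = np.random.default_rng(seed)
    xs = np.arange(n_inputs)
    states = world_state(xs)
    off_diag = ~np.eye(n_inputs, dtype=bool)
    same = (states[:, None] == states[None, :]) & off_diag
    scores = []
    for _ in range(n_tasks):
        # Synthetic task: labels depend on the input only through its state.
        table = rng.normal(size=4)
        train_x = rng.choice(xs, size=n_train, replace=False)
        predict = fit_1nn(train_x, table[world_state(train_x)], featurize)
        preds = predict(xs)
        # Among distinct inputs sharing a state, how often do predictions match?
        agree = preds[:, None] == preds[None, :]
        scores.append(agree[same].mean())
    return float(np.mean(scores))

# Learner whose features coincide with the postulated state.
aligned = same_state_agreement(world_state)
# Learner that only sees the raw input and generalizes by proximity.
raw = same_state_agreement(lambda x: x)
print(f"aligned learner: {aligned:.2f}, raw-input learner: {raw:.2f}")
```

In this toy, the learner whose features match the postulated state always extrapolates consistently with it, while the proximity-based learner can fit its training points and still extrapolate in ways that ignore the state. A low score of this kind is the sort of mismatch between inductive bias and world model that the abstract describes.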
