What Has a Foundation Model Found? Using Inductive Bias to Probe for World Models

Abstract

Foundation models are premised on the idea that sequence prediction can uncover deeper domain understanding, much like how Kepler's predictions of planetary motion later led to the discovery of Newtonian mechanics. However, evaluating whether these models truly capture deeper structure remains a challenge. We develop a technique for evaluating foundation models that examines how they adapt to synthetic datasets generated from some postulated world model. Our technique measures whether the foundation model's inductive bias aligns with the world model, and so we refer to it as an inductive bias probe. Across multiple domains, we find that foundation models can excel at their training tasks yet fail to develop inductive biases towards the underlying world model when adapted to new tasks. In particular, we find that foundation models trained on orbital trajectories consistently fail to apply Newtonian mechanics when adapted to new physics tasks. Further analysis reveals that these models behave as if they develop task-specific heuristics that fail to generalize.
