A Generalist Dynamics Model for Control
May 18, 2023
Authors: Ingmar Schubert, Jingwei Zhang, Jake Bruce, Sarah Bechtle, Emilio Parisotto, Martin Riedmiller, Jost Tobias Springenberg, Arunkumar Byravan, Leonard Hasenclever, Nicolas Heess
cs.AI
Abstract
We investigate the use of transformer sequence models as dynamics models
(TDMs) for control. In a number of experiments in the DeepMind Control Suite,
we find, first, that TDMs perform well in a single-environment learning setting
when compared to baseline models. Second, TDMs exhibit strong generalization
capabilities to unseen environments, both in a few-shot setting, where a
generalist model is fine-tuned with small amounts of data from the target
environment, and in a zero-shot setting, where a generalist model is applied to
an unseen environment without any further training. We further demonstrate that
generalizing system dynamics can work much better than generalizing optimal
behavior directly as a policy. This makes TDMs a promising ingredient for a
foundation model of control.
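
The abstract does not say how a dynamics model is turned into actions. One common option, assumed here purely for illustration, is model-predictive control with random shooting: sample candidate action sequences, roll each one out through the model, and execute the first action of the best sequence. In the sketch below, `toy_dynamics` and `toy_reward` are hypothetical hand-written stand-ins (a 1-D point mass), not a trained TDM or a DeepMind Control Suite task; a learned sequence model would take the place of the `model` argument.

```python
import random


def toy_dynamics(state, action):
    """Hypothetical stand-in for a learned dynamics model.

    State is (position, velocity) of a 1-D point mass; action is a force in [-1, 1].
    """
    pos, vel = state
    vel = vel + 0.1 * action
    pos = pos + 0.1 * vel
    return (pos, vel)


def toy_reward(state):
    """Hypothetical task reward: stay near the origin with low velocity."""
    pos, vel = state
    return -(pos ** 2) - 0.1 * (vel ** 2)


def plan_action(model, reward_fn, state, horizon=10, num_candidates=200, rng=None):
    """Random-shooting MPC: return the first action of the best sampled sequence."""
    rng = rng or random.Random(0)
    best_return = float("-inf")
    best_first_action = 0.0
    for _ in range(num_candidates):
        # Sample one candidate action sequence uniformly at random.
        actions = [rng.uniform(-1.0, 1.0) for _ in range(horizon)]
        # Roll the sequence out through the model and sum predicted rewards.
        s, total = state, 0.0
        for a in actions:
            s = model(s, a)
            total += reward_fn(s)
        if total > best_return:
            best_return, best_first_action = total, actions[0]
    return best_first_action


def run_episode(model, reward_fn, start_state, steps=50, seed=0):
    """Replan at every step; here the model also plays the role of the environment."""
    rng = random.Random(seed)
    state = start_state
    for _ in range(steps):
        action = plan_action(model, reward_fn, state, rng=rng)
        state = model(state, action)
    return state
```

Calling `run_episode(toy_dynamics, toy_reward, (1.0, 0.0))` replans at every step and returns the final state. Because the planner only needs a `model(state, action)` callable, the same loop could in principle be reused with a single-environment model, a fine-tuned generalist, or a zero-shot generalist, which is the comparison the abstract describes.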