A Generalist Dynamics Model for Control
May 18, 2023
Authors: Ingmar Schubert, Jingwei Zhang, Jake Bruce, Sarah Bechtle, Emilio Parisotto, Martin Riedmiller, Jost Tobias Springenberg, Arunkumar Byravan, Leonard Hasenclever, Nicolas Heess
cs.AI
Abstract
We investigate the use of transformer sequence models as dynamics models
(TDMs) for control. In a number of experiments in the DeepMind control suite,
we find that first, TDMs perform well in a single-environment learning setting
when compared to baseline models. Second, TDMs exhibit strong generalization
capabilities to unseen environments, both in a few-shot setting, where a
generalist model is fine-tuned with small amounts of data from the target
environment, and in a zero-shot setting, where a generalist model is applied to
an unseen environment without any further training. We further demonstrate that
generalizing system dynamics can work much better than generalizing optimal
behavior directly as a policy. This makes TDMs a promising ingredient for a
foundation model of control.