A Generalist Dynamics Model for Control

May 18, 2023
Authors: Ingmar Schubert, Jingwei Zhang, Jake Bruce, Sarah Bechtle, Emilio Parisotto, Martin Riedmiller, Jost Tobias Springenberg, Arunkumar Byravan, Leonard Hasenclever, Nicolas Heess
cs.AI

Abstract

We investigate the use of transformer sequence models as dynamics models (TDMs) for control. In a number of experiments in the DeepMind control suite, we find, first, that TDMs perform well in a single-environment learning setting when compared to baseline models. Second, TDMs exhibit strong generalization capabilities to unseen environments, both in a few-shot setting, where a generalist model is fine-tuned with small amounts of data from the target environment, and in a zero-shot setting, where a generalist model is applied to an unseen environment without any further training. We further demonstrate that generalizing system dynamics can work much better than generalizing optimal behavior directly as a policy. This makes TDMs a promising ingredient for a foundation model of control.
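The abstract does not say how a dynamics model is turned into actions, so the following is only an illustrative sketch of one common way to do it: random-shooting model-predictive control. Everything in it is an assumption made for illustration, not the authors' method. The function `toy_dynamics` is a hand-written 1-D point mass standing in for a learned model such as a TDM, and the names `reward`, `plan_action`, and `run_episode` are hypothetical.

```python
import numpy as np


def toy_dynamics(state, action):
    """Hypothetical stand-in for a learned dynamics model (e.g. a TDM).

    state = [position, velocity]; action is a force in [-1, 1].
    """
    pos, vel = state
    vel = vel + 0.1 * action
    pos = pos + 0.1 * vel
    return np.array([pos, vel])


def reward(state):
    """Assumed task: bring the point mass to rest at the origin."""
    pos, vel = state
    return -(pos ** 2) - 0.1 * (vel ** 2)


def plan_action(model, state, rng, horizon=10, n_candidates=256):
    """Random-shooting MPC: sample action sequences, roll each one out
    through the model, and return the first action of the best sequence."""
    candidates = rng.uniform(-1.0, 1.0, size=(n_candidates, horizon))
    returns = np.zeros(n_candidates)
    for i, sequence in enumerate(candidates):
        s = state
        for a in sequence:
            s = model(s, a)          # predicted next state
            returns[i] += reward(s)  # accumulate predicted reward
    return candidates[np.argmax(returns), 0]


def run_episode(model, start, steps=50, seed=0):
    """Control loop: replan at every step, then step the environment.

    Here the environment is the same toy system as the model; with a
    learned model the two would differ.
    """
    rng = np.random.default_rng(seed)
    state = np.asarray(start, dtype=float)
    for _ in range(steps):
        action = plan_action(model, state, rng)
        state = toy_dynamics(state, action)
    return state


if __name__ == "__main__":
    final_state = run_episode(toy_dynamics, start=[1.0, 0.0])
    print("final [position, velocity]:", final_state)
```

In this kind of setup, only the `model` argument depends on the environment. That is one way to read the abstract's claim: a model that generalizes across environments can be reused by the same planner, whereas a policy would have to generalize the behavior itself.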