Learning to Identify Critical States for Reinforcement Learning from Videos
August 15, 2023
Authors: Haozhe Liu, Mingchen Zhuge, Bing Li, Yuhui Wang, Francesco Faccio, Bernard Ghanem, Jürgen Schmidhuber
cs.AI
Abstract
Recent work on deep reinforcement learning (DRL) has pointed out that
algorithmic information about good policies can be extracted from offline data
which lack explicit information about executed actions. For example, videos of
humans or robots may convey a lot of implicit information about rewarding
action sequences, but a DRL machine that wants to profit from watching such
videos must first learn by itself to identify and recognize relevant
states/actions/rewards. Without relying on ground-truth annotations, our new
method called Deep State Identifier learns to predict returns from episodes
encoded as videos. It then applies a mask-based sensitivity analysis to
identify the critical states. Extensive experiments showcase our
method's potential for understanding and improving agent behavior. The source
code and the generated datasets are available at
https://github.com/AI-Initiative-KAUST/VideoRLCS.
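
The idea of scoring states by how much masking them changes a predicted return can be illustrated with a minimal sketch. This is not the Deep State Identifier itself: it swaps in plain occlusion-based sensitivity, where each frame is replaced by a baseline one at a time, instead of a learned mask. The names `toy_return_predictor`, `frame_sensitivity`, and `critical_states`, the one-number "frames", and the threshold are all hypothetical and chosen only for illustration.

```python
from typing import Callable, List, Sequence

Frame = Sequence[float]  # hypothetical stand-in for a video frame's features


def toy_return_predictor(frames: Sequence[Frame]) -> float:
    """Hypothetical stand-in for a trained return predictor.

    The "return" is the largest first feature seen in the episode,
    so a single frame can decide the outcome.
    """
    return max(frame[0] for frame in frames)


def frame_sensitivity(
    frames: Sequence[Frame],
    predictor: Callable[[Sequence[Frame]], float],
    baseline: Frame,
) -> List[float]:
    """Score each frame by how much masking it changes the predicted return."""
    reference = predictor(frames)
    scores = []
    for t in range(len(frames)):
        masked = list(frames)
        masked[t] = baseline  # occlude frame t only (assumed masking scheme)
        scores.append(abs(reference - predictor(masked)))
    return scores


def critical_states(scores: Sequence[float], threshold: float) -> List[int]:
    """Return indices of frames whose sensitivity exceeds the threshold."""
    return [t for t, s in enumerate(scores) if s > threshold]


# Toy episode: only the third frame carries the high-return signal.
episode = [[0.0], [0.1], [1.0], [0.2]]
scores = frame_sensitivity(episode, toy_return_predictor, baseline=[0.0])
print(critical_states(scores, threshold=0.5))  # prints [2]
```

In this toy episode only the third frame changes the prediction when masked, so it is the only one flagged. A real setup would replace the toy predictor with a network trained on video episodes and their returns.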