Open-World Skill Discovery from Unsegmented Demonstrations
March 11, 2025
Authors: Jingwen Deng, Zihao Wang, Shaofei Cai, Anji Liu, Yitao Liang
cs.AI
Abstract
Learning skills in open-world environments is essential for developing agents
capable of handling a variety of tasks by combining basic skills. Online
demonstration videos are typically long but unsegmented, making them difficult
to segment and label with skill identifiers. Unlike existing methods that rely
on sequence sampling or human labeling, we have developed a self-supervised
learning-based approach to segment these long videos into a series of
semantics-aware and skill-consistent segments. Drawing inspiration from human
cognitive event segmentation theory, we introduce Skill Boundary Detection
(SBD), an annotation-free temporal video segmentation algorithm. SBD detects
skill boundaries in a video by leveraging prediction errors from a pretrained
unconditional action-prediction model. This approach is based on the assumption
that a significant increase in prediction error indicates a shift in the skill
being executed. We evaluated our method in Minecraft, a rich open-world
simulator with extensive gameplay videos available online. Segments generated by
SBD improved the average performance of conditioned policies by 63.7% and
52.1% on short-term atomic skill tasks, and that of the corresponding hierarchical
agents by 11.3% and 20.8% on long-horizon tasks. Our method can leverage
diverse YouTube videos to train instruction-following agents. The project page
is available at https://craftjarvis.github.io/SkillDiscovery.
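
The boundary-detection idea described in the abstract can be illustrated with a minimal sketch. It assumes that per-frame prediction errors from a pretrained action-prediction model are already available as a list of floats, and it uses a simple frame-to-frame jump threshold as a stand-in for "a significant increase in prediction error"; the abstract does not specify the actual criterion. The function names, `jump_threshold`, and `min_segment_len` are hypothetical and chosen only for illustration.

```python
from typing import List, Sequence


def detect_skill_boundaries(
    losses: Sequence[float],
    jump_threshold: float,
    min_segment_len: int = 1,
) -> List[int]:
    """Return frame indices at which a new segment is assumed to start.

    losses[t] is the prediction error of a (hypothetical) pretrained
    action-prediction model at frame t. A boundary is placed at t when
    the error rises by more than jump_threshold relative to frame t - 1.
    This thresholding rule is an assumption, not the paper's criterion.
    """
    boundaries: List[int] = []
    last_boundary = 0
    for t in range(1, len(losses)):
        jump = losses[t] - losses[t - 1]
        # Require a minimum segment length so adjacent spikes do not
        # produce degenerate one-frame segments.
        if jump > jump_threshold and t - last_boundary >= min_segment_len:
            boundaries.append(t)
            last_boundary = t
    return boundaries


def split_into_segments(num_frames: int, boundaries: Sequence[int]) -> List[range]:
    """Convert boundary indices into half-open frame ranges."""
    starts = [0, *boundaries]
    ends = [*boundaries, num_frames]
    return [range(s, e) for s, e in zip(starts, ends)]


# Toy per-frame errors with two sharp increases, at frames 3 and 7.
toy_losses = [0.2, 0.3, 0.25, 2.1, 0.4, 0.3, 0.35, 1.9, 0.5]
toy_boundaries = detect_skill_boundaries(
    toy_losses, jump_threshold=1.0, min_segment_len=2
)
toy_segments = split_into_segments(len(toy_losses), toy_boundaries)
```

On the toy input, the two error spikes split the nine frames into three segments. In practice the errors would come from running the action-prediction model over a real video, and the threshold would need tuning.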