CAJun: Continuous Adaptive Jumping using a Learned Centroidal Controller

Abstract

We present CAJun, a novel hierarchical learning and control framework that enables legged robots to jump continuously with adaptive jumping distances. CAJun consists of a high-level centroidal policy and a low-level leg controller. In particular, we use reinforcement learning (RL) to train the centroidal policy, which specifies the gait timing, base velocity, and swing foot position for the leg controller. Following the gait timing, the leg controller uses optimal control to optimize motor commands for the swing and stance legs so that they track the swing foot target and base velocity commands. Additionally, we reformulate the stance leg optimizer in the leg controller to speed up policy training by an order of magnitude. By combining RL with optimal control methods, our system achieves the versatility of learning while enjoying the robustness of control methods, making it easily transferable to real robots. We show that after 20 minutes of training on a single GPU, CAJun can achieve continuous, long jumps with adaptive distances on a Go1 robot with small sim-to-real gaps. Moreover, the robot can jump across gaps with a maximum width of 70 cm, which is over 40% wider than the gaps cleared by existing methods.
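
The two-level structure described in the abstract can be illustrated with a minimal interface sketch. This is not the authors' implementation. All names (`CentroidalCommand`, `LegController`, `run_hierarchy`), the representation of gait timing as a stepping frequency, the `contact_fraction` parameter, and the rate at which the two levels run are assumptions made for illustration. The placeholder controller performs no optimization and only shows which command a leg would track in each gait phase.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Vec3 = Tuple[float, float, float]


@dataclass(frozen=True)
class CentroidalCommand:
    """The three quantities the high-level policy specifies (field names are hypothetical)."""
    gait_frequency_hz: float  # gait timing, assumed here to be a stepping frequency
    base_velocity: Vec3
    swing_foot_target: Vec3


@dataclass
class LegController:
    """Placeholder for the low-level controller; no optimization is performed."""
    contact_fraction: float = 0.5  # assumed share of each gait cycle spent in stance
    phase: float = 0.0             # gait phase in [0, 1)

    def step(self, cmd: CentroidalCommand, dt: float) -> Tuple[str, Vec3]:
        # Advance the gait phase using the timing chosen by the policy.
        self.phase = (self.phase + cmd.gait_frequency_hz * dt) % 1.0
        if self.phase < self.contact_fraction:
            # Stance: the leg would be used to track the base velocity command.
            return ("stance", cmd.base_velocity)
        # Swing: the leg would be moved toward the swing foot target.
        return ("swing", cmd.swing_foot_target)


def run_hierarchy(
    policy: Callable[[object], CentroidalCommand],
    controller: LegController,
    observation: object,
    policy_steps: int,
    control_steps_per_policy_step: int,
    dt: float,
) -> List[Tuple[str, Vec3]]:
    """Query the policy at a low rate and run the leg controller several times per query."""
    log = []
    for _ in range(policy_steps):
        cmd = policy(observation)
        for _ in range(control_steps_per_policy_step):
            log.append(controller.step(cmd, dt))
    return log


def constant_policy(observation: object) -> CentroidalCommand:
    # Stand-in for the learned policy: always returns the same command.
    return CentroidalCommand(1.0, (0.5, 0.0, 0.0), (0.2, 0.0, 0.0))


log = run_hierarchy(constant_policy, LegController(), observation=None,
                    policy_steps=1, control_steps_per_policy_step=4, dt=0.25)
print([mode for mode, _ in log])  # ['stance', 'swing', 'swing', 'stance']
```

In a real system the body of `step` would be replaced by the swing and stance leg optimizers, and `constant_policy` by the trained RL policy. The sketch only fixes the direction of information flow between the two levels.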
