CAJun: Continuous Adaptive Jumping using a Learned Centroidal Controller
June 16, 2023
Authors: Yuxiang Yang, Guanya Shi, Xiangyun Meng, Wenhao Yu, Tingnan Zhang, Jie Tan, Byron Boots
cs.AI
Abstract
We present CAJun, a novel hierarchical learning and control framework that
enables legged robots to jump continuously with adaptive jumping distances.
CAJun consists of a high-level centroidal policy and a low-level leg
controller. In particular, we use reinforcement learning (RL) to train the
centroidal policy, which specifies the gait timing, base velocity, and swing
foot position for the leg controller. Following the gait timing, the leg
controller uses optimal control to compute motor commands for the swing and
stance legs that track the swing foot target and base velocity commands.
Additionally, we reformulate the stance leg optimizer in the leg controller to
speed up policy training by an order of magnitude. By combining RL with
optimal control methods, our system achieves the versatility of learning
while enjoying the robustness of control methods, making it easily transferable
to real robots. We show that after 20 minutes of training on a single GPU,
CAJun can achieve continuous, long jumps with adaptive distances on a Go1 robot
with small sim-to-real gaps. Moreover, the robot can jump across gaps with a
maximum width of 70 cm, over 40% wider than those crossed by existing methods.
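
The two-level structure described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: every name here (`CentroidalCommand`, `placeholder_policy`, `control_step`) is hypothetical, the gait timing is represented as a single stepping frequency shared by all legs, the policy is a fixed stub rather than a trained network, and the optimal-control solve is omitted entirely. It only shows how a high-level command might be routed to swing or stance tracking according to the gait phase.

```python
from dataclasses import dataclass
from typing import List, Tuple

Vec3 = Tuple[float, float, float]


@dataclass
class CentroidalCommand:
    """Hypothetical high-level command; field names are illustrative only."""
    stepping_frequency: float  # Hz, an assumed stand-in for "gait timing"
    base_velocity: Vec3        # desired base velocity, m/s
    swing_foot_target: Vec3    # desired swing foot position, m


def placeholder_policy(observation: List[float]) -> CentroidalCommand:
    """Stand-in for the RL-trained centroidal policy (returns a fixed command)."""
    return CentroidalCommand(1.5, (1.0, 0.0, 0.0), (0.3, 0.0, -0.25))


def advance_phase(phase: float, frequency: float, dt: float) -> float:
    """Advance a normalized gait phase in [0, 1) by one control step."""
    return (phase + frequency * dt) % 1.0


def leg_mode(phase: float, stance_fraction: float = 0.5) -> str:
    """Assumed convention: the first part of each cycle is stance, the rest swing."""
    return "stance" if phase < stance_fraction else "swing"


def control_step(phase: float, command: CentroidalCommand, dt: float = 0.002):
    """One low-level step: update the phase, then pick the quantity to track.

    A real leg controller would solve an optimization for motor commands here;
    this sketch only returns the tracking target for the active mode.
    """
    phase = advance_phase(phase, command.stepping_frequency, dt)
    mode = leg_mode(phase)
    if mode == "stance":
        target = command.base_velocity      # stance legs track base velocity
    else:
        target = command.swing_foot_target  # swing legs track the foot target
    return phase, mode, target
```

Calling `control_step(0.0, placeholder_policy([0.0]))` advances the phase by one step and returns the stance-mode target, while a phase past the assumed stance fraction returns the swing foot target instead.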