CAJun: Continuous Adaptive Jumping using a Learned Centroidal Controller

Abstract

We present CAJun, a novel hierarchical learning and control framework that enables legged robots to jump continuously with adaptive jumping distances. CAJun consists of a high-level centroidal policy and a low-level leg controller. We use reinforcement learning (RL) to train the centroidal policy, which specifies the gait timing, base velocity, and swing foot position for the leg controller. Following the gait timing, the leg controller uses optimal control to compute motor commands for the swing and stance legs that track the swing foot target and base velocity commands. Additionally, we reformulate the stance leg optimizer in the leg controller to speed up policy training by an order of magnitude. By combining RL with optimal control, our system retains the versatility of learning while benefiting from the robustness of control methods, which makes it easy to transfer to real robots. We show that after 20 minutes of training on a single GPU, CAJun achieves continuous, long jumps with adaptive distances on a Go1 robot with a small sim-to-real gap. Moreover, the robot can jump across gaps up to 70 cm wide, which is over 40% wider than what existing methods achieve.
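The division of labor between the two levels can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the names `CentroidalCommand`, `placeholder_policy`, and `leg_controller`, the hand-written rules, and all numeric constants are hypothetical and are not taken from the CAJun implementation. The policy here is a fixed rule rather than a trained network, and the stance-leg step is a simple proportional term standing in for the optimal-control solve described in the abstract.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

Vec3 = Tuple[float, float, float]


@dataclass
class CentroidalCommand:
    """High-level policy output (field names are hypothetical)."""
    contact_schedule: List[bool]    # gait timing: True means the leg is in stance
    base_velocity: Vec3             # desired base velocity in m/s
    swing_foot_targets: List[Vec3]  # desired foot position for each leg


def placeholder_policy(jump_distance: float, phase: float) -> CentroidalCommand:
    """Stand-in for the RL-trained centroidal policy (a fixed rule, not learned)."""
    in_stance = phase < 0.5              # first half of the cycle: all legs push off
    forward_speed = 2.0 * jump_distance  # arbitrary scaling, for illustration only
    return CentroidalCommand(
        contact_schedule=[in_stance] * 4,
        base_velocity=(forward_speed, 0.0, 0.0),
        swing_foot_targets=[(0.0, 0.0, -0.25)] * 4,
    )


def leg_controller(cmd: CentroidalCommand, measured_velocity: Vec3,
                   gain: float = 10.0) -> List[Dict[str, object]]:
    """Stand-in for the low-level leg controller.

    Stance legs share a proportional velocity-tracking force (a placeholder
    for the stance-leg optimizer); swing legs receive a position target.
    """
    error = tuple(d - m for d, m in zip(cmd.base_velocity, measured_velocity))
    commands: List[Dict[str, object]] = []
    for leg in range(4):
        if cmd.contact_schedule[leg]:
            force = tuple(gain * e / 4 for e in error)
            commands.append({"mode": "stance", "force": force})
        else:
            commands.append({"mode": "swing",
                             "foot_target": cmd.swing_foot_targets[leg]})
    return commands
```

In a control loop, the policy would be queried at a low rate to produce a `CentroidalCommand`, and the leg controller would turn that command into per-leg outputs at a higher rate. The sketch omits both rates and all robot dynamics.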
