Discovering Adaptable Symbolic Algorithms from Scratch
July 31, 2023
Authors: Stephen Kelly, Daniel S. Park, Xingyou Song, Mitchell McIntire, Pranav Nashikkar, Ritam Guha, Wolfgang Banzhaf, Kalyanmoy Deb, Vishnu Naresh Boddeti, Jie Tan, Esteban Real
cs.AI
Abstract
Autonomous robots deployed in the real world will need control policies that
rapidly adapt to environmental changes. To this end, we propose
AutoRobotics-Zero (ARZ), a method based on AutoML-Zero that discovers zero-shot
adaptable policies from scratch. In contrast to neural network adaptation
policies, where only model parameters are optimized, ARZ can build control
algorithms with the full expressive power of a linear register machine. We
evolve modular policies that tune their model parameters and alter their
inference algorithm on the fly to adapt to sudden environmental changes. We
demonstrate our method on a realistic simulated quadruped robot, for which we
evolve safe control policies that avoid falling when individual limbs suddenly
break. This is a challenging task in which two popular neural network baselines
fail. Finally, we conduct a detailed analysis of our method on a novel and
challenging non-stationary control task dubbed Cataclysmic Cartpole. Results
confirm our findings that ARZ is significantly more robust to sudden
environmental changes and can build simple, interpretable control policies.