Discovering Adaptable Symbolic Algorithms from Scratch
July 31, 2023
Authors: Stephen Kelly, Daniel S. Park, Xingyou Song, Mitchell McIntire, Pranav Nashikkar, Ritam Guha, Wolfgang Banzhaf, Kalyanmoy Deb, Vishnu Naresh Boddeti, Jie Tan, Esteban Real
cs.AI
Abstract
Autonomous robots deployed in the real world will need control policies that
rapidly adapt to environmental changes. To this end, we propose
AutoRobotics-Zero (ARZ), a method based on AutoML-Zero that discovers zero-shot
adaptable policies from scratch. In contrast to neural network adaptation
policies, where only model parameters are optimized, ARZ can build control
algorithms with the full expressive power of a linear register machine. We
evolve modular policies that tune their model parameters and alter their
inference algorithm on the fly to adapt to sudden environmental changes. We
demonstrate our method on a realistic simulated quadruped robot, for which we
evolve safe control policies that avoid falling when individual limbs suddenly
break. This is a challenging task in which two popular neural network baselines
fail. Finally, we conduct a detailed analysis of our method on a novel and
challenging non-stationary control task dubbed Cataclysmic Cartpole. Results
confirm our findings that ARZ is significantly more robust to sudden
environmental changes and can build simple, interpretable control policies.
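
To make the phrase "linear register machine" concrete, the following is a minimal sketch of a stateful policy expressed as a short list of register instructions. It is an illustration only: the instruction set, the register layout, and the names `OPS`, `run_program`, and `make_policy` are assumptions made for this example and are not taken from ARZ, and no evolutionary search is shown.

```python
import math

# Hypothetical instruction set for illustration; not the ARZ operation set.
OPS = {
    "add": lambda a, b: a + b,
    "sub": lambda a, b: a - b,
    "mul": lambda a, b: a * b,
    "copy": lambda a, _b: a,
    "tanh": lambda a, _b: math.tanh(a),
}


def run_program(program, registers):
    """Execute instructions in order; each one writes a single register."""
    for op, dst, src1, src2 in program:
        registers[dst] = OPS[op](registers[src1], registers[src2])
    return registers


def make_policy(program, n_registers, n_inputs, out_register):
    """Wrap a program as a policy whose registers persist between calls."""
    registers = [0.0] * n_registers

    def policy(observation):
        # Load the observation into the first n_inputs registers.
        registers[:n_inputs] = observation
        run_program(program, registers)
        return registers[out_register]

    return policy


# Assumed layout: r0 = observation, r1 = previous observation (memory),
# r2 = change since the last step, r3 = action.
program = [
    ("sub", 2, 0, 1),   # r2 = r0 - r1
    ("copy", 1, 0, 0),  # r1 = r0 (remember for the next step)
    ("add", 3, 0, 2),   # r3 = r0 + r2
    ("tanh", 3, 3, 3),  # r3 = tanh(r3), squashes the action into (-1, 1)
]

policy = make_policy(program, n_registers=4, n_inputs=1, out_register=3)
first_action = policy([0.5])
second_action = policy([0.5])
```

Because register `r1` carries over from one call to the next, the same observation produces a different action on the second step. This persistent memory is one simple way a register-based program can change its behavior over time without any external parameter update.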