Memory, Benchmark & Robots: A Benchmark for Solving Complex Tasks with Reinforcement Learning

Abstract

Memory is crucial for enabling agents to tackle complex tasks with temporal and spatial dependencies. While many reinforcement learning (RL) algorithms incorporate memory, the field lacks a universal benchmark to assess an agent's memory capabilities across diverse scenarios. This gap is particularly evident in tabletop robotic manipulation, where memory is essential for solving tasks with partial observability and ensuring robust performance, yet no standardized benchmarks exist. To address this, we introduce MIKASA (Memory-Intensive Skills Assessment Suite for Agents), a comprehensive benchmark for memory RL, with three key contributions: (1) we propose a comprehensive classification framework for memory-intensive RL tasks, (2) we collect MIKASA-Base, a unified benchmark that enables systematic evaluation of memory-enhanced agents across diverse scenarios, and (3) we develop MIKASA-Robo, a novel benchmark of 32 carefully designed memory-intensive tasks that assess memory capabilities in tabletop robotic manipulation. Our contributions establish a unified framework for advancing memory RL research, driving the development of more reliable systems for real-world applications. The code is available at https://sites.google.com/view/memorybenchrobots/.
