OceanGym: A Benchmark Environment for Underwater Embodied Agents

September 30, 2025
Authors: Yida Xue, Mingjun Mao, Xiangyuan Ru, Yuqi Zhu, Baochang Ren, Shuofei Qiao, Mengru Wang, Shumin Deng, Xinyu An, Ningyu Zhang, Ying Chen, Huajun Chen
cs.AI

Abstract

We introduce OceanGym, the first comprehensive benchmark for ocean underwater embodied agents, designed to advance AI in one of the most demanding real-world environments. Unlike terrestrial or aerial domains, underwater settings present extreme perceptual and decision-making challenges, including low visibility and dynamic ocean currents, which make effective agent deployment exceptionally difficult. OceanGym encompasses eight realistic task domains and a unified agent framework driven by Multi-modal Large Language Models (MLLMs), which integrates perception, memory, and sequential decision-making. Agents are required to comprehend optical and sonar data, autonomously explore complex environments, and accomplish long-horizon objectives under these harsh conditions. Extensive experiments reveal substantial gaps between state-of-the-art MLLM-driven agents and human experts, highlighting the persistent difficulty of perception, planning, and adaptability in ocean underwater environments. By providing a high-fidelity, rigorously designed platform, OceanGym establishes a testbed for developing robust embodied AI and transferring these capabilities to real-world autonomous ocean underwater vehicles, marking a decisive step toward intelligent agents capable of operating in one of Earth's last unexplored frontiers. The code and data are available at https://github.com/OceanGPT/OceanGym.
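As a rough illustration of how a loop combining perception, memory, and sequential decision-making might be organized, the following is a minimal sketch and not the OceanGym implementation. Every name in it (`Observation`, `ToyEnv`, `stub_policy`, `run_episode`) is hypothetical, the environment is a one-dimensional toy, and the policy is a hard-coded stand-in for where an MLLM call would go.

```python
from collections import deque
from dataclasses import dataclass
from typing import Callable, Deque, List, Tuple


@dataclass
class Observation:
    """Hypothetical observation holding text placeholders for both sensor types."""
    optical: str  # stands in for a camera frame
    sonar: str    # stands in for a sonar reading


# A policy maps (current observation, remembered history) to an action name.
History = List[Tuple[Observation, str]]
Policy = Callable[[Observation, History], str]


class ToyEnv:
    """Hypothetical 1-D environment: the target sits `distance` steps ahead."""

    def __init__(self, distance: int) -> None:
        self.remaining = distance

    def observe(self) -> Observation:
        return Observation(
            optical="low visibility",
            sonar=f"target at range {self.remaining}",
        )

    def step(self, action: str) -> bool:
        """Apply an action and report whether the target has been reached."""
        if action == "forward" and self.remaining > 0:
            self.remaining -= 1
        return self.remaining == 0


def stub_policy(obs: Observation, history: History) -> str:
    """Stand-in for an MLLM call: stop at range 0, otherwise move forward."""
    return "stop" if obs.sonar.endswith(" 0") else "forward"


def run_episode(env: ToyEnv, policy: Policy,
                max_steps: int = 20, memory_size: int = 8) -> List[str]:
    """Run a perceive -> remember -> decide loop and return the actions taken."""
    memory: Deque[Tuple[Observation, str]] = deque(maxlen=memory_size)
    actions: List[str] = []
    for _ in range(max_steps):
        obs = env.observe()                   # perception
        action = policy(obs, list(memory))    # decision, conditioned on memory
        memory.append((obs, action))          # bounded memory of recent steps
        actions.append(action)
        if action == "stop" or env.step(action):
            break
    return actions
```

In this sketch the bounded `deque` keeps only the most recent steps, one simple way to cap the context passed to a model over a long-horizon task; a real agent would replace `stub_policy` with a model call and the text placeholders with actual optical and sonar data.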