OpenGVL - Benchmarking Visual Temporal Progress for Data Curation

Abstract

Data scarcity remains one of the most limiting factors in driving progress in robotics. However, the amount of robotics data available in the wild is growing exponentially, creating new opportunities for large-scale data utilization. Reliable temporal task completion prediction could help automatically annotate and curate this data at scale. The Generative Value Learning (GVL) approach was recently proposed, leveraging the knowledge embedded in vision-language models (VLMs) to predict task progress from visual observations. Building upon GVL, we propose OpenGVL, a comprehensive benchmark for estimating task progress across diverse, challenging manipulation tasks involving both robotic and human embodiments. We evaluate the capabilities of publicly available open-source foundation models and show that open-source model families significantly underperform their closed-source counterparts, achieving only approximately 70% of their performance on temporal progress prediction tasks. Furthermore, we demonstrate how OpenGVL can serve as a practical tool for automated data curation and filtering, enabling efficient quality assessment of large-scale robotics datasets. We release the benchmark along with the complete codebase at github.com/budzianowski/opengvl.
