The ObjectFolder Benchmark: Multisensory Learning with Neural and Real Objects

Abstract

We introduce the ObjectFolder Benchmark, a benchmark suite of 10 tasks for multisensory object-centric learning, centered around object recognition, reconstruction, and manipulation with sight, sound, and touch. We also introduce the ObjectFolder Real dataset, including the multisensory measurements for 100 real-world household objects, building upon a newly designed pipeline for collecting the 3D meshes, videos, impact sounds, and tactile readings of real-world objects. We conduct systematic benchmarking on both the 1,000 multisensory neural objects from ObjectFolder, and the real multisensory data from ObjectFolder Real. Our results demonstrate the importance of multisensory perception and reveal the respective roles of vision, audio, and touch for different object-centric learning tasks. By publicly releasing our dataset and benchmark suite, we hope to catalyze and enable new research in multisensory object-centric learning in computer vision, robotics, and beyond. Project page: https://objectfolder.stanford.edu
