RT-Sketch: Goal-Conditioned Imitation Learning from Hand-Drawn Sketches

Abstract

Natural language and images are commonly used as goal representations in goal-conditioned imitation learning (IL). However, natural language can be ambiguous and images can be over-specified. In this work, we propose hand-drawn sketches as a modality for goal specification in visual imitation learning. Like language, sketches are easy for users to provide on the fly; like images, they help a downstream policy to be spatially aware, and they can go beyond images by disambiguating task-relevant from task-irrelevant objects. We present RT-Sketch, a goal-conditioned policy for manipulation that takes a hand-drawn sketch of the desired scene as input and outputs actions. We train RT-Sketch on a dataset of paired trajectories and corresponding synthetically generated goal sketches. We evaluate this approach on six manipulation skills involving tabletop object rearrangements on an articulated countertop. Experimentally, we find that RT-Sketch performs on a similar level to image- or language-conditioned agents in straightforward settings, while achieving greater robustness when language goals are ambiguous or visual distractors are present. Additionally, we show that RT-Sketch can interpret and act upon sketches with varied levels of specificity, ranging from minimal line drawings to detailed, colored drawings. For supplementary material and videos, please refer to our website: http://rt-sketch.github.io.
