Vision-Based Hand Gesture Customization from a Single Demonstration

Abstract

Hand gesture recognition is becoming a more prevalent mode of human-computer interaction, especially as cameras proliferate across everyday devices. Despite continued progress in this field, gesture customization remains underexplored. Customization is crucial because it lets users define and demonstrate gestures that are more natural, memorable, and accessible. However, customization requires efficient use of user-provided data. We introduce a method that enables users to easily design bespoke gestures with a monocular camera from a single demonstration. We employ transformers and meta-learning techniques to address few-shot learning challenges. Unlike prior work, our method supports any combination of one-handed, two-handed, static, and dynamic gestures, including different viewpoints. We evaluated our customization method through a user study with 20 gestures collected from 21 participants, achieving up to 97% average recognition accuracy from one demonstration. Our work provides a viable path for vision-based gesture customization, laying the foundation for future advancements in this domain.
