FlyPose: Towards Robust Human Pose Estimation From Aerial Views
January 9, 2026
Authors: Hassaan Farooq, Marvin Brenner, Peter Stütz
cs.AI
Abstract
Unmanned Aerial Vehicles (UAVs) are increasingly deployed in close proximity to humans for applications such as parcel delivery, traffic monitoring, disaster response and infrastructure inspection. Ensuring safe and reliable operation in these human-populated environments demands accurate perception of human poses and actions from an aerial viewpoint. This perspective challenges existing methods with low resolution, steep viewing angles and (self-)occlusion, especially when the application requires models that can run in real time. We train and deploy FlyPose, a lightweight top-down human pose estimation pipeline for aerial imagery. Through multi-dataset training, we achieve an average improvement of 6.8 mAP in person detection across the test sets of Manipal-UAV, VisDrone, HIT-UAV and our custom dataset. For 2D human pose estimation, we report an improvement of 16.3 mAP on the challenging UAV-Human dataset. FlyPose runs with an inference latency of ~20 milliseconds, including preprocessing, on a Jetson Orin AGX Developer Kit and is deployed onboard a quadrotor UAV during flight experiments. We also publish FlyPose-104, a small but challenging aerial human pose estimation dataset that includes manual annotations from difficult aerial perspectives: https://github.com/farooqhassaan/FlyPose.
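For readers unfamiliar with the term, a top-down pipeline first detects each person and then estimates keypoints inside each detected box, mapping them back to full-image coordinates. The sketch below illustrates only that control flow. It is not the FlyPose implementation: `stub_detector`, `stub_pose`, `crop` and `top_down_pose` are hypothetical names, and the stubs return fixed values in place of trained models.

```python
from typing import Callable, List, Tuple

Box = Tuple[int, int, int, int]   # (x0, y0, x1, y1) in image pixels
Keypoint = Tuple[float, float]    # (x, y)
Image = List[List[int]]           # toy grayscale image as nested lists


def crop(image: Image, box: Box) -> Image:
    """Cut the region described by box out of the image."""
    x0, y0, x1, y1 = box
    return [row[x0:x1] for row in image[y0:y1]]


def top_down_pose(
    image: Image,
    detect: Callable[[Image], List[Box]],
    estimate: Callable[[Image], List[Keypoint]],
) -> List[List[Keypoint]]:
    """Run the detector once, then the pose estimator once per detected person."""
    poses = []
    for box in detect(image):
        x0, y0, _, _ = box
        local_keypoints = estimate(crop(image, box))
        # Shift crop-relative keypoints back into full-image coordinates.
        poses.append([(x + x0, y + y0) for x, y in local_keypoints])
    return poses


# Hypothetical stand-ins for trained models (assumptions, not from the paper).
def stub_detector(image: Image) -> List[Box]:
    return [(2, 1, 6, 5)]  # one fixed "person" box


def stub_pose(patch: Image) -> List[Keypoint]:
    height, width = len(patch), len(patch[0])
    return [(width / 2, height / 2)]  # a single keypoint at the crop centre


image = [[0] * 8 for _ in range(6)]  # 8x6 blank image
poses = top_down_pose(image, stub_detector, stub_pose)
print(poses)
```

Because the pose estimator runs once per detection, the cost of this second stage grows with the number of people in view, which is one reason a lightweight pose model matters for onboard use.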