FlyPose: Towards Robust Human Pose Estimation From Aerial Views
January 9, 2026
Authors: Hassaan Farooq, Marvin Brenner, Peter Stütz
cs.AI
Abstract
Unmanned Aerial Vehicles (UAVs) are increasingly deployed in close proximity to humans for applications such as parcel delivery, traffic monitoring, disaster response and infrastructure inspection. Ensuring safe and reliable operation in these human-populated environments demands accurate perception of human poses and actions from an aerial viewpoint. This perspective challenges existing methods with low resolution, steep viewing angles and (self-)occlusion, especially when the application requires models that can run in real time. We train and deploy FlyPose, a lightweight top-down human pose estimation pipeline for aerial imagery. Through multi-dataset training, we achieve an average improvement of 6.8 mAP in person detection across the test sets of Manipal-UAV, VisDrone, HIT-UAV and our custom dataset. For 2D human pose estimation, we report an improvement of 16.3 mAP on the challenging UAV-Human dataset. FlyPose runs with an inference latency of about 20 milliseconds, including preprocessing, on a Jetson Orin AGX Developer Kit and is deployed onboard a quadrotor UAV during flight experiments. We also publish FlyPose-104, a small but challenging aerial human pose estimation dataset that includes manual annotations from difficult aerial perspectives: https://github.com/farooqhassaan/FlyPose.
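To illustrate what "top-down" means here, the following is a minimal sketch of the glue between the two stages of a generic top-down pipeline: each detected person box is expanded to the pose model's input aspect ratio, and the predicted keypoints are mapped back to image coordinates. It is not taken from the FlyPose code. The function names, the padding factor and the 192x256 input size are assumptions chosen for illustration.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x, y, width, height) in image pixels
Point = Tuple[float, float]


def expand_box_to_aspect(box: Box, aspect: float, padding: float = 1.25) -> Box:
    """Grow a detection box around its center to a fixed width/height ratio.

    `aspect` is width / height of the pose model input (assumed, e.g. 192 / 256).
    `padding` is a hypothetical margin factor applied after the aspect fix.
    """
    x, y, w, h = box
    cx, cy = x + w / 2.0, y + h / 2.0
    # Enlarge one side so that w / h == aspect; the box is never shrunk.
    if w > aspect * h:
        h = w / aspect
    else:
        w = aspect * h
    w, h = w * padding, h * padding
    return (cx - w / 2.0, cy - h / 2.0, w, h)


def crop_to_image_coords(
    keypoints: List[Point], crop_box: Box, input_size: Tuple[int, int]
) -> List[Point]:
    """Map keypoints from pose-model input pixels back to image pixels."""
    x, y, w, h = crop_box
    in_w, in_h = input_size
    return [(x + kx * w / in_w, y + ky * h / in_h) for kx, ky in keypoints]


if __name__ == "__main__":
    # A tall, narrow person box as a detector might return it (made-up values).
    crop = expand_box_to_aspect((100.0, 50.0, 20.0, 80.0), aspect=192 / 256, padding=1.0)
    # The center of the model input should map back to the center of the box.
    print(crop, crop_to_image_coords([(96.0, 128.0)], crop, (192, 256)))
```

In a full pipeline, the person detector would supply the boxes and the pose network would supply the keypoints for each resized crop. Both are omitted here because their interfaces depend on the models used.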