SIMSPINE: A Biomechanics-Aware Simulation Framework for 3D Spine Motion Annotation and Benchmarking
February 24, 2026
Authors: Muhammad Saif Ullah Khan, Didier Stricker
cs.AI
Abstract
Modeling spinal motion is fundamental to understanding human biomechanics, yet it remains underexplored in computer vision because of the spine's complex multi-joint kinematics and the lack of large-scale 3D annotations. We present a biomechanics-aware keypoint simulation framework that augments existing human pose datasets with anatomically consistent 3D spinal keypoints derived from musculoskeletal modeling. Using this framework, we create SIMSPINE, the first open dataset to provide sparse vertebra-level 3D spinal annotations for natural full-body motions in indoor multi-camera capture without external restraints. With 2.14 million frames, SIMSPINE enables data-driven learning of vertebral kinematics from subtle posture variations and bridges the gap between musculoskeletal simulation and computer vision. In addition, we release pretrained baselines covering fine-tuned 2D detectors, monocular 3D pose lifting models, and multi-view reconstruction pipelines, establishing a unified benchmark for biomechanically valid spine motion estimation. Specifically, our 2D spine baselines improve the state of the art from 0.63 to 0.80 AUC in controlled environments, and from 0.91 to 0.93 AP for in-the-wild spine tracking. Together, the simulation framework and the SIMSPINE dataset advance research in vision-based biomechanics, motion analysis, and digital human modeling by enabling reproducible, anatomically grounded 3D spine estimation under natural conditions.