

Automatic Image-Level Morphological Trait Annotation for Organismal Images

April 2, 2026
Authors: Vardaan Pahuja, Samuel Stevens, Alyson East, Sydne Record, Yu Su
cs.AI

Abstract

Morphological traits are physical characteristics of biological organisms that provide vital clues on how organisms interact with their environment. Yet extracting these traits remains a slow, expert-driven process, limiting their use in large-scale ecological studies. A major bottleneck is the absence of high-quality datasets linking biological images to trait-level annotations. In this work, we demonstrate that sparse autoencoders trained on foundation-model features yield monosemantic, spatially grounded neurons that consistently activate on meaningful morphological parts. Leveraging this property, we introduce a trait annotation pipeline that localizes salient regions and uses vision-language prompting to generate interpretable trait descriptions. Using this approach, we construct Bioscan-Traits, a dataset of 80K trait annotations spanning 19K insect images from BIOSCAN-5M. Human evaluation confirms the biological plausibility of the generated morphological descriptions. We assess design sensitivity through a comprehensive ablation study, systematically varying key design choices and measuring their impact on the quality of the resulting trait descriptions. By annotating traits with a modular pipeline rather than prohibitively expensive manual efforts, we offer a scalable way to inject biologically meaningful supervision into foundation models, enable large-scale morphological analyses, and bridge the gap between ecological relevance and machine-learning practicality.
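The core component the abstract describes is a sparse autoencoder trained on frozen foundation-model features, whose hidden units become monosemantic detectors of morphological parts. As a rough illustration of that component (not the authors' implementation — all dimensions, initializations, and coefficients here are hypothetical), a standard sparse autoencoder is a ReLU encoder/linear decoder pair trained with a reconstruction loss plus an L1 penalty on the codes:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions: d_model for the foundation-model patch features,
# d_hidden (overcomplete) for the sparse dictionary. Illustrative numbers only.
d_model, d_hidden = 64, 256

# Randomly initialized parameters of a vanilla sparse autoencoder.
W_enc = rng.normal(0, 0.1, (d_model, d_hidden))
b_enc = np.zeros(d_hidden)
W_dec = rng.normal(0, 0.1, (d_hidden, d_model))
b_dec = np.zeros(d_model)

def sae_forward(x):
    """Encode a batch of feature vectors into sparse codes and reconstruct."""
    z = np.maximum(x @ W_enc + b_enc, 0.0)  # ReLU gives non-negative codes
    x_hat = z @ W_dec + b_dec
    return z, x_hat

def sae_loss(x, z, x_hat, l1_coeff=1e-3):
    """Reconstruction error plus an L1 penalty that encourages sparse codes."""
    recon = np.mean((x - x_hat) ** 2)
    sparsity = l1_coeff * np.abs(z).mean()
    return recon + sparsity

# A batch of 8 stand-in "patch features"; in the pipeline these would come
# from a frozen vision foundation model evaluated on insect images.
x = rng.normal(size=(8, d_model))
z, x_hat = sae_forward(x)
loss = sae_loss(x, z, x_hat)
```

In the annotation pipeline, a trained version of such an encoder would map each image patch to a code `z`; the patches where a given hidden unit fires most strongly delineate the salient region that is then passed, with a prompt, to a vision-language model for a trait description.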