Towards Visual Re-Identification of Fish using Fine-Grained Classification for Electronic Monitoring in Fisheries
December 9, 2025
Authors: Samitha Nuwan Thilakarathna, Ercan Avsar, Martin Mathias Nielsen, Malte Pedersen
cs.AI
Abstract
Accurate fisheries data are crucial for effective and sustainable marine resource management. With the recent adoption of Electronic Monitoring (EM) systems, more video data is now being collected than can feasibly be reviewed manually. This paper addresses the challenge by developing an optimized deep learning pipeline for automated fish re-identification (Re-ID) using the novel AutoFish dataset, which simulates a conveyor-belt EM system and contains six similar-looking fish species. We demonstrate that key Re-ID metrics (R1 and mAP@k) are substantially improved by using hard triplet mining in conjunction with a custom image transformation pipeline that includes dataset-specific normalization. By employing these strategies, we show that the Vision Transformer-based Swin-T architecture consistently outperforms the Convolutional Neural Network-based ResNet-50, achieving peak performance of 41.65% mAP@k and 90.43% Rank-1 accuracy. An in-depth analysis reveals that the primary challenge is distinguishing visually similar individuals of the same species (intra-species errors), where viewpoint inconsistency proves significantly more detrimental than partial occlusion. The source code and documentation are available at: https://github.com/msamdk/Fish_Re_Identification.git
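To illustrate the hard triplet mining the abstract refers to, below is a minimal NumPy sketch of batch-hard triplet loss in the style of Hermans et al.; the function name, shapes, and margin value are illustrative assumptions, not code from the AutoFish repository.

```python
import numpy as np

def batch_hard_triplet_loss(embeddings: np.ndarray, labels: np.ndarray,
                            margin: float = 0.3) -> float:
    """Batch-hard triplet loss sketch (illustrative, not the paper's code).

    For each anchor, select the hardest positive (farthest sample with the
    same identity) and the hardest negative (closest sample with a
    different identity), then apply the standard triplet margin loss.
    """
    # Pairwise Euclidean distance matrix, shape (N, N).
    diff = embeddings[:, None, :] - embeddings[None, :, :]
    dist = np.sqrt((diff ** 2).sum(-1) + 1e-12)

    same_id = labels[:, None] == labels[None, :]
    self_mask = np.eye(len(labels), dtype=bool)

    # Hardest positive: maximum distance over same-ID pairs (excluding self).
    pos_dist = np.where(same_id & ~self_mask, dist, -np.inf).max(axis=1)
    # Hardest negative: minimum distance over different-ID pairs.
    neg_dist = np.where(~same_id, dist, np.inf).min(axis=1)

    losses = np.maximum(pos_dist - neg_dist + margin, 0.0)
    return float(losses.mean())

# Toy batch: four 2-D embeddings from two identities.
emb = np.array([[0.0, 0.0], [0.1, 0.0], [1.0, 1.0], [1.1, 1.0]])
ids = np.array([0, 0, 1, 1])
loss = batch_hard_triplet_loss(emb, ids, margin=0.3)  # well-separated IDs -> 0.0
```

In the paper's pipeline the embeddings would come from the Swin-T or ResNet-50 backbone; mining the hardest in-batch pairs is what makes fine-grained, intra-species distinctions contribute most to the gradient.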