Pixel-Level Pavement Distress Assessment Using Instance Segmentation

Abstract

Automated pavement distress assessment requires more than image-level classification or coarse bounding-box detection: maintenance-relevant quantification depends on precisely localizing thin, branching, and irregular cracks. This paper presents a vision-based pavement distress analysis system built on Mask R-CNN instance segmentation and evaluates it on UWGB-StreetCrack, a custom field-collected roadway image dataset acquired with a vehicle-mounted smartphone and manually annotated with polygon labels for longitudinal cracks, transverse cracks, alligator cracks, and potholes. Five Detectron2-based Mask R-CNN backbone variants were considered under a consistent fine-tuning protocol. The best-performing model, Mask R-CNN with a ResNet-101 FPN backbone, achieved 84.23% precision, 90.04% recall, and an F1 score of 87.04% under the project-specific bounding-box matching protocol. The same model produced an aggregate predicted crack-area fraction of 2.164%, closely matching the ground-truth crack-area fraction of 2.170%. To place the segmentation system in context against a detector-oriented alternative, a CSPDarknet53-based YOLO detector was also adapted and retrained on the dataset; it reached 27.5% precision and 20.7% recall under the validation protocol. The results indicate that instance segmentation is a practical direction for field pavement imagery and aggregate crack-area estimation, while also exposing open challenges in annotation consistency, class imbalance, confounder rejection, and mask-level benchmarking.
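
As an illustration of the two headline quantities, the following minimal sketch computes an F1 score from precision and recall and an aggregate crack-area fraction from binary masks. It is not the paper's evaluation code. The function names, the mask representation (one merged 0/1 mask per image), and the definition of the aggregate fraction as total crack pixels divided by total pixels are assumptions made for this example; the project-specific bounding-box matching protocol is not reproduced here.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall, both given as fractions in [0, 1]."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def aggregate_area_fraction(masks):
    """Fraction of crack pixels over all pixels in a set of binary masks.

    Assumption: `masks` is a list of 2D lists of 0/1 values, one per image,
    with overlapping instance masks already merged into a single mask.
    """
    crack_pixels = 0
    total_pixels = 0
    for mask in masks:
        for row in mask:
            crack_pixels += sum(1 for value in row if value)
            total_pixels += len(row)
    return crack_pixels / total_pixels if total_pixels else 0.0


# Precision and recall reported for the ResNet-101 FPN model.
f1 = f1_score(0.8423, 0.9004)
print(f"F1 = {100 * f1:.2f}%")

# Toy masks (hypothetical data, not taken from UWGB-StreetCrack).
toy_masks = [
    [[0, 1, 0, 0], [0, 1, 0, 0]],
    [[0, 0, 0, 0], [1, 1, 0, 0]],
]
print(f"area fraction = {100 * aggregate_area_fraction(toy_masks):.2f}%")
```

With the reported precision and recall, the harmonic mean rounds to the stated F1 score of 87.04%.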