PaddleOCR-VL-1.6: Expanding the Frontier of Document Parsing through Under-Optimized Region Refinement and Progressive Post-Training

Abstract

We introduce PaddleOCR-VL-1.6, an upgraded compact document parsing model built upon PaddleOCR-VL-1.5. Although PaddleOCR-VL-1.5 establishes a strong 0.9B baseline, its remaining errors concentrate in under-optimized regions where model behavior is unstable, data coverage is sparse, or supervision is unreliable. Rather than expanding the training corpus indiscriminately, PaddleOCR-VL-1.6 introduces a region-aware data optimization framework that identifies weak regions of the previous model, applies targeted enhancement to those regions, and improves the reliability of their supervision signals. It further adopts a progressive post-training recipe based on curated data selection and reinforcement learning, raising model performance through staged optimization. PaddleOCR-VL-1.6 achieves a new state-of-the-art score of 96.33% on OmniDocBench v1.6, is competitive with top-tier vision-language models (VLMs), and provides a practical post-training recipe for the PaddleOCR-VL series.
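
As a rough illustration of what identifying under-optimized regions could look like, the sketch below groups per-sample evaluation scores by a region label and flags regions that score poorly, behave unstably (high score variance), or are sparsely covered. The function name `find_weak_regions`, the thresholds, the sample data, and the use of content categories as "regions" are all assumptions made for illustration; the abstract does not specify how regions are defined or detected.

```python
from collections import defaultdict
from statistics import mean, pstdev


def find_weak_regions(samples, min_mean=0.9, max_std=0.15, min_count=3):
    """Flag regions with a low mean score, high score spread, or few samples.

    samples: iterable of (region_label, score) pairs, score in [0, 1].
    All thresholds are illustrative assumptions, not values from the paper.
    """
    by_region = defaultdict(list)
    for region, score in samples:
        by_region[region].append(score)

    weak = {}
    for region, scores in by_region.items():
        reasons = []
        if mean(scores) < min_mean:
            reasons.append("low_score")   # previous model is inaccurate here
        if pstdev(scores) > max_std:
            reasons.append("unstable")    # scores vary widely across samples
        if len(scores) < min_count:
            reasons.append("sparse")      # too few samples cover this region
        if reasons:
            weak[region] = reasons
    return weak


# Hypothetical per-sample scores from a previous model, keyed by content type.
samples = [
    ("text", 0.98), ("text", 0.97), ("text", 0.99),
    ("table", 0.95), ("table", 0.50), ("table", 0.92),
    ("formula", 0.93),
]
weak = find_weak_regions(samples)
print(weak)
```

With this toy data, `table` is flagged as both low-scoring and unstable, `formula` as sparse, and `text` is left alone. In a targeted-enhancement setting, only the flagged regions would be candidates for additional data or relabeling.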