

ForCenNet: Foreground-Centric Network for Document Image Rectification

July 26, 2025
作者: Peng Cai, Qiang Li, Kaicheng Yang, Dong Guo, Jia Li, Nan Zhou, Xiang An, Ninghua Yang, Jiankang Deng
cs.AI

Abstract

Document image rectification aims to eliminate geometric deformation in photographed documents to facilitate text recognition. However, existing methods often neglect the significance of foreground elements, which provide essential geometric references and layout information for document image correction. In this paper, we introduce the Foreground-Centric Network (ForCenNet) to eliminate geometric distortions in document images. Specifically, we first propose a foreground-centric label generation method, which extracts detailed foreground elements from an undistorted image. We then introduce a foreground-centric mask mechanism to enhance the distinction between readable and background regions. Furthermore, we design a curvature consistency loss that leverages the detailed foreground labels to help the model understand the distorted geometric distribution. Extensive experiments demonstrate that ForCenNet achieves a new state of the art on four real-world benchmarks: DocUNet, DIR300, WarpDoc, and DocReal. Quantitative analysis shows that the proposed method effectively rectifies layout elements such as text lines and table borders. Resources for further comparison are provided at https://github.com/caipeng328/ForCenNet.
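The abstract does not spell out the exact form of the curvature consistency loss, so the following is a minimal, hedged sketch of one plausible formulation: text lines are assumed to be represented as ordered 2D control points, and the loss penalizes discrepancies between the discrete curvature (second-order differences) of predicted and ground-truth lines. The function names, tensor shapes, and sampling scheme are illustrative assumptions, not the authors' implementation.

```python
# Hedged sketch of a curvature-consistency term (assumed formulation).
# Assumes each foreground line (e.g., a text line or table border) is
# sampled as an ordered sequence of 2D points.
import torch

def curvature(points: torch.Tensor) -> torch.Tensor:
    """Discrete curvature proxy via second-order differences.
    points: (num_lines, num_points, 2) ordered 2D samples per line."""
    return points[:, 2:, :] - 2.0 * points[:, 1:-1, :] + points[:, :-2, :]

def curvature_consistency_loss(pred_points: torch.Tensor,
                               gt_points: torch.Tensor) -> torch.Tensor:
    """Penalize the gap between the curvature of predicted (rectified)
    foreground lines and that of the corresponding ground-truth lines."""
    return torch.mean(torch.abs(curvature(pred_points) - curvature(gt_points)))

# Toy usage: 4 lines, 16 sampled points each.
pred = torch.rand(4, 16, 2, requires_grad=True)
gt = torch.rand(4, 16, 2)
loss = curvature_consistency_loss(pred, gt)
loss.backward()
```

In this reading, the term pushes the rectified foreground geometry toward the (typically straight) layout of the undistorted reference; the actual ForCenNet loss may differ in how lines are sampled and weighted.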