CDM: A Reliable Metric for Fair and Accurate Formula Recognition Evaluation
September 5, 2024
Authors: Bin Wang, Fan Wu, Linke Ouyang, Zhuangcheng Gu, Rui Zhang, Renqiu Xia, Bo Zhang, Conghui He
cs.AI
Abstract
Formula recognition presents significant challenges due to the complicated
structure and varied notation of mathematical expressions. Despite continuous
advancements in formula recognition models, the evaluation metrics employed by
these models, such as BLEU and Edit Distance, still exhibit notable
limitations. They overlook the fact that the same formula has diverse
representations, and they are highly sensitive to the distribution of training
data, which makes formula recognition evaluation unfair. To this end,
we propose the Character Detection Matching (CDM) metric, which ensures objective
evaluation by computing the score at the image level rather than the LaTeX level.
Specifically, CDM renders both the model-predicted LaTeX and the ground-truth
LaTeX formulas into image-formatted formulas, then employs visual feature
extraction and localization techniques for precise character-level matching,
incorporating spatial position information. Such a spatially-aware and
character-matching method offers a more accurate and equitable evaluation
compared with previous BLEU and Edit Distance metrics that rely solely on
text-based character matching. Experimentally, we evaluated various formula
recognition models using CDM, BLEU, and ExpRate metrics. The results
demonstrate that CDM aligns more closely with human evaluation standards
and provides a fairer comparison across different models by eliminating
discrepancies caused by diverse formula representations.
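
To make the contrast between text-level and image-level scoring concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the authors' implementation. The glyph positions are written by hand and stand in for the rendering and character detection steps. The `Glyph` class, the `match_f1` function, and the `max_dist` threshold are hypothetical names introduced here. The greedy nearest-neighbor matching and the F1 score are simplifications chosen for brevity, and they may differ from the matching procedure and scoring that CDM actually uses.

```python
from dataclasses import dataclass
from math import hypot


def edit_distance(a, b):
    """Levenshtein distance between two token sequences."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,              # delete x
                            curr[j - 1] + 1,          # insert y
                            prev[j - 1] + (x != y)))  # substitute x -> y
        prev = curr
    return prev[-1]


@dataclass(frozen=True)
class Glyph:
    """Hypothetical stand-in for one detected character in a rendered image."""
    char: str
    x: float  # box centre, normalised to [0, 1]
    y: float


def match_f1(pred, truth, max_dist=0.05):
    """Greedy one-to-one matching of same-character glyphs by centre distance.

    Simplified assumption: a predicted glyph matches the nearest unused
    ground-truth glyph with the same character within max_dist.
    """
    unused = list(truth)
    matched = 0
    for p in pred:
        candidates = [t for t in unused
                      if t.char == p.char
                      and hypot(t.x - p.x, t.y - p.y) <= max_dist]
        if candidates:
            best = min(candidates, key=lambda t: hypot(t.x - p.x, t.y - p.y))
            unused.remove(best)
            matched += 1
    if matched == 0:
        return 0.0
    precision = matched / len(pred)
    recall = matched / len(truth)
    return 2 * precision * recall / (precision + recall)


# Two LaTeX spellings of the same formula: x^2_i and x_i^2.
truth_tokens = ["x", "^", "2", "_", "i"]
pred_tokens = ["x", "_", "i", "^", "2"]

# Both spellings render identically, so the (hand-written) glyphs coincide.
truth_glyphs = [Glyph("x", 0.3, 0.5), Glyph("2", 0.7, 0.3), Glyph("i", 0.7, 0.7)]
pred_glyphs = [Glyph("x", 0.3, 0.5), Glyph("i", 0.7, 0.7), Glyph("2", 0.7, 0.3)]

# A genuinely wrong prediction that drops the subscript.
wrong_glyphs = [Glyph("x", 0.3, 0.5), Glyph("2", 0.7, 0.3)]

if __name__ == "__main__":
    print("token edit distance:", edit_distance(pred_tokens, truth_tokens))
    print("glyph F1 (equivalent spelling):", match_f1(pred_glyphs, truth_glyphs))
    print("glyph F1 (missing subscript):", match_f1(wrong_glyphs, truth_glyphs))
```

In this toy setup the token-level distance penalizes the reordered but equivalent spelling, while the glyph-level score treats it as a perfect match and only drops when a character is actually missing.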