CDM: A Reliable Metric for Fair and Accurate Formula Recognition Evaluation
September 5, 2024
Authors: Bin Wang, Fan Wu, Linke Ouyang, Zhuangcheng Gu, Rui Zhang, Renqiu Xia, Bo Zhang, Conghui He
cs.AI
Abstract
Formula recognition presents significant challenges due to the complicated
structure and varied notation of mathematical expressions. Despite continuous
advancements in formula recognition models, the evaluation metrics employed by
these models, such as BLEU and Edit Distance, still exhibit notable
limitations. They overlook the fact that the same formula can be written in
diverse ways, and they are highly sensitive to the distribution of training
data, which makes formula recognition evaluation unfair. To this end, we
propose the Character Detection Matching (CDM) metric, which ensures objective
evaluation by computing the score at the image level rather than the LaTeX level.
Specifically, CDM renders both the model-predicted LaTeX and the ground-truth
LaTeX formulas into image-formatted formulas, then employs visual feature
extraction and localization techniques for precise character-level matching,
incorporating spatial position information. This spatially aware
character-matching method offers a more accurate and equitable evaluation
than the BLEU and Edit Distance metrics, which rely solely on
text-based character matching. Experimentally, we evaluated various formula
recognition models using the CDM, BLEU, and ExpRate metrics. The results
demonstrate that CDM aligns more closely with human evaluation standards
and provides a fairer comparison across different models by eliminating
discrepancies caused by diverse formula representations.
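
To make the contrast concrete, the following is a minimal sketch, not the authors' implementation. It compares a text-level edit distance with a toy position-aware character match for two LaTeX strings that render to the same formula. The character lists stand in for detector output on rendered images and are hand-written here. The names `match_characters` and `f1_score`, the tolerance `tol`, and the F1-style aggregation are all assumptions made for illustration; the abstract does not state how matches are turned into a score.

```python
def edit_distance(a, b):
    """Levenshtein distance between two token sequences."""
    prev = list(range(len(b) + 1))
    for i, tok_a in enumerate(a, start=1):
        curr = [i]
        for j, tok_b in enumerate(b, start=1):
            cost = 0 if tok_a == tok_b else 1
            curr.append(min(prev[j] + 1,          # delete tok_a
                            curr[j - 1] + 1,      # insert tok_b
                            prev[j - 1] + cost))  # substitute or keep
        prev = curr
    return prev[-1]


def match_characters(pred, truth, tol=2.0):
    """Hypothetical helper: greedily pair characters that share a label
    and lie within `tol` units of each other on both axes.
    Each character is a (label, x, y) tuple. Returns the number of pairs."""
    unused = list(truth)
    matched = 0
    for label, x, y in pred:
        for cand in unused:
            c_label, cx, cy = cand
            if c_label == label and abs(cx - x) <= tol and abs(cy - y) <= tol:
                unused.remove(cand)
                matched += 1
                break
    return matched


def f1_score(matched, n_pred, n_truth):
    """Assumed aggregation: harmonic mean of precision and recall."""
    if n_pred == 0 or n_truth == 0 or matched == 0:
        return 0.0
    precision = matched / n_pred
    recall = matched / n_truth
    return 2 * precision * recall / (precision + recall)


# Two LaTeX token sequences for the same formula: x^{2} and x^2.
truth_tokens = ["x", "^", "{", "2", "}"]
pred_tokens = ["x", "^", "2"]
text_distance = edit_distance(pred_tokens, truth_tokens)

# Hand-written stand-ins for characters detected in the two rendered images.
# Both strings render identically, so the layouts are the same.
truth_chars = [("x", 0.0, 0.0), ("2", 6.0, -4.0)]
pred_chars = [("x", 0.0, 0.0), ("2", 6.0, -4.0)]
matched = match_characters(pred_chars, truth_chars)
image_score = f1_score(matched, len(pred_chars), len(truth_chars))

print("text-level edit distance:", text_distance)
print("image-level match score:", image_score)
```

In this toy case the text-level metric penalizes the prediction for omitting the optional braces, while the image-level match treats the two formulas as identical. An optimal assignment could replace the greedy pairing; the greedy loop is used here only to keep the sketch dependency-free.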