arXiVeri: Automatic table verification with GPT

Abstract

Without accurate transcription of numerical data in scientific documents, a scientist cannot draw accurate conclusions. Unfortunately, the process of copying numerical data from one paper to another is prone to human error. In this paper, we propose to meet this challenge through the novel task of automatic table verification (AutoTV), in which the objective is to verify the accuracy of numerical data in tables by cross-referencing cited sources. To support this task, we propose a new benchmark, arXiVeri, which comprises tabular data drawn from open-access academic papers on arXiv. We introduce metrics to evaluate the performance of a table verifier in two key areas: (i) table matching, which aims to identify the source table in a cited document that corresponds to a target table, and (ii) cell matching, which aims to locate shared cells between a target and source table and identify their row and column indices accurately. By leveraging the flexible capabilities of modern large language models (LLMs), we propose simple baselines for table verification. Our findings highlight the complexity of this task, even for state-of-the-art LLMs like OpenAI's GPT-4. The code and benchmark will be made publicly available.
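To make the two evaluation areas concrete, the sketch below shows the kind of output each one expects. It is a naive exact-match illustration, not the LLM-based baselines proposed in the paper. It assumes tables are already available as 2D lists of strings, and the names `parse_number`, `match_cells`, and `match_table` are hypothetical helpers introduced only for this example.

```python
from typing import Dict, List, Optional, Tuple

Table = List[List[str]]
Index = Tuple[int, int]  # (row, column), zero-based


def parse_number(cell: str) -> Optional[float]:
    """Return the cell as a float, or None if it is not numeric."""
    try:
        return float(cell.strip().rstrip("%"))
    except ValueError:
        return None


def match_cells(target: Table, source: Table) -> List[Tuple[Index, Index]]:
    """Cell matching (assumed exact-value rule): pair every numeric target
    cell with each source cell that holds the same value."""
    source_positions: Dict[float, List[Index]] = {}
    for r, row in enumerate(source):
        for c, cell in enumerate(row):
            value = parse_number(cell)
            if value is not None:
                source_positions.setdefault(value, []).append((r, c))

    matches: List[Tuple[Index, Index]] = []
    for r, row in enumerate(target):
        for c, cell in enumerate(row):
            value = parse_number(cell)
            if value is not None:
                for source_index in source_positions.get(value, []):
                    matches.append(((r, c), source_index))
    return matches


def match_table(target: Table, candidates: List[Table]) -> int:
    """Table matching (assumed heuristic): pick the candidate table in the
    cited document that shares the most cells with the target."""
    return max(
        range(len(candidates)),
        key=lambda i: len(match_cells(target, candidates[i])),
    )


# Toy data, invented for illustration only.
target = [["Method", "Acc"], ["A", "71.3"], ["B", "68.0"]]
source = [["Model", "Top-1", "Top-5"], ["A", "71.3", "90.1"]]
other = [["Model", "FLOPs"], ["C", "4.1"]]

print(match_cells(target, source))
print(match_table(target, [other, source]))
```

Exact matching of this kind breaks down as soon as values are rounded, rescaled, or reported in different units between papers, which is one reason the task remains hard even for strong LLMs.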
