FinTagging: An LLM-ready Benchmark for Extracting and Structuring Financial Information

Abstract

We introduce FinTagging, the first full-scope, table-aware XBRL benchmark designed to evaluate the structured information extraction and semantic alignment capabilities of large language models (LLMs) in XBRL-based financial reporting. Unlike prior benchmarks, which oversimplify XBRL tagging as flat multi-class classification and focus solely on narrative text, FinTagging decomposes the tagging problem into two subtasks: FinNI for financial entity extraction and FinCL for taxonomy-driven concept alignment. It requires models to jointly extract facts and align them with the full US-GAAP taxonomy of more than 10,000 entries, across both unstructured text and structured tables, enabling realistic, fine-grained evaluation. We assess a diverse set of LLMs under zero-shot settings, systematically analyzing their performance on both subtasks and on overall tagging accuracy. Our results reveal that, while LLMs generalize well in information extraction, they struggle with fine-grained concept alignment, particularly in disambiguating closely related taxonomy entries. These findings highlight the limitations of existing LLMs in fully automating XBRL tagging and underscore the need for improved semantic reasoning and schema-aware modeling to meet the demands of accurate financial disclosure. Code is available at our GitHub repository, and the data is hosted in our Hugging Face repository.
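To make the two-subtask decomposition concrete, the following is a minimal toy sketch, not the benchmark's implementation or evaluation code. It assumes a regex-based extractor as a stand-in for a FinNI-style step and a word-overlap matcher as a stand-in for a FinCL-style step. The names `TOY_TAXONOMY`, `Fact`, `extract_facts`, and `align_concept` are hypothetical, and the three taxonomy entries with their descriptions are illustrative placeholders for the full taxonomy.

```python
import re
from dataclasses import dataclass

# Illustrative stand-in for the taxonomy; the real one has 10,000+ entries.
# Descriptions are paraphrased placeholders, not official definitions.
TOY_TAXONOMY = {
    "Revenues": "total revenues recognized during the period",
    "NetIncomeLoss": "net income or loss for the period",
    "CashAndCashEquivalentsAtCarryingValue": "cash and cash equivalents at carrying value",
}


@dataclass
class Fact:
    value: str    # numeric mention exactly as it appears in the text
    context: str  # lowercased text immediately preceding the mention


def extract_facts(text, window=40):
    """Toy extraction step: find numeric mentions and keep the preceding context."""
    facts = []
    for match in re.finditer(r"\$?\d[\d,]*(?:\.\d+)?", text):
        start = max(0, match.start() - window)
        context = text[start:match.start()].strip().lower()
        facts.append(Fact(value=match.group(), context=context))
    return facts


def align_concept(fact, taxonomy):
    """Toy alignment step: choose the concept whose description shares the most words."""
    context_words = set(re.findall(r"[a-z]+", fact.context))

    def overlap(name):
        description_words = set(re.findall(r"[a-z]+", taxonomy[name].lower()))
        return len(context_words & description_words)

    return max(taxonomy, key=overlap)


if __name__ == "__main__":
    sentence = "Net income for the period was $1,250 million."
    for fact in extract_facts(sentence):
        print(fact.value, "->", align_concept(fact, TOY_TAXONOMY))
```

Word overlap is used here only because it is easy to follow. It would likely fail on the closely related taxonomy entries that the abstract identifies as the hard case, which is the gap the concept-alignment subtask is meant to measure.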
