TabDSR: Decompose, Sanitize, and Reason for Complex Numerical Reasoning in Tabular Data
November 4, 2025
Authors: Changjiang Jiang, Fengchang Yu, Haihua Chen, Wei Lu, Jin Zeng
cs.AI
Abstract
Complex reasoning over tabular data is crucial in real-world data analysis,
yet large language models (LLMs) often underperform due to complex queries,
noisy data, and limited numerical capabilities. To address these issues, we
propose TabDSR, a framework consisting of: (1) a query decomposer that breaks
down complex questions, (2) a table sanitizer that cleans and filters noisy
tables, and (3) a program-of-thoughts (PoT)-based reasoner that generates
executable code to derive the final answer from the sanitized table. To ensure
unbiased evaluation and mitigate data leakage, we introduce a new dataset,
CalTab151, specifically designed for complex numerical reasoning over tables.
Experimental results demonstrate that TabDSR consistently outperforms existing
methods, achieving state-of-the-art (SOTA) performance with accuracy
improvements of 8.79%, 6.08%, and 19.87% on TAT-QA, TableBench, and CalTab151,
respectively.
Moreover, our framework integrates seamlessly with mainstream LLMs, providing a
robust solution for complex tabular numerical reasoning. These findings
highlight the effectiveness of our framework in enhancing LLM performance for
complex tabular numerical reasoning. Data and code are available upon request.
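
The three stages named in the abstract (decompose, sanitize, reason) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the cleaning rules, the sample table, and the hard-coded "generated" program are all assumptions made for illustration, and the LLM calls that a real system would make are replaced by hand-written stubs.

```python
import re

def sanitize_cell(cell):
    """Convert a noisy cell such as '$1,200' or '(350)' to a float; keep other text."""
    text = str(cell).strip()
    # Assumed rule: accounting-style parentheses mark a negative number.
    negative = text.startswith("(") and text.endswith(")")
    cleaned = re.sub(r"[,$%()\s]", "", text)
    try:
        value = float(cleaned)
    except ValueError:
        return text  # not numeric: keep the stripped label
    return -value if negative else value

def sanitize_table(rows):
    """Drop fully empty rows and sanitize every remaining cell."""
    return [
        [sanitize_cell(c) for c in row]
        for row in rows
        if any(str(c).strip() for c in row)
    ]

def decompose(question):
    """Stand-in for an LLM query decomposer: returns hand-written sub-questions."""
    return [
        "What is revenue in 2023?",
        "What is revenue in 2022?",
        "What is the percentage change between them?",
    ]

def reason(table, generated_code):
    """Run program-of-thoughts style code on the table and return its `answer`."""
    scope = {"table": table}
    exec(generated_code, scope)  # acceptable only for trusted code; sandbox otherwise
    return scope["answer"]

# Hypothetical noisy input table.
raw_table = [
    ["Item", "2023", "2022"],
    ["Revenue", "$1,200", "$1,000"],
    ["", "", ""],
    ["Net loss", "(350)", "(400)"],
]

# Hypothetical program an LLM might emit for the sub-questions above.
generated_code = """
row = next(r for r in table if r[0] == "Revenue")
answer = round((row[1] - row[2]) / row[2] * 100, 2)
"""

table = sanitize_table(raw_table)
sub_questions = decompose("By what percentage did revenue grow from 2022 to 2023?")
answer = reason(table, generated_code)
print(answer)
```

Delegating the arithmetic to executed code rather than to free-text generation is the point of the PoT step: the model only has to produce a correct program, and the interpreter performs the calculation on the cleaned values.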