ST-Raptor: LLM-Powered Semi-Structured Table Question Answering

Abstract

Semi-structured tables, widely used in real-world applications (e.g., financial reports, medical records, transactional orders), often have flexible and complex layouts (e.g., hierarchical headers and merged cells). Answering natural language questions over these tables generally relies on human analysts to interpret the layouts, which is costly and inefficient. Existing methods for automating this procedure face significant challenges. First, methods like NL2SQL require converting semi-structured tables into structured ones, which often causes substantial information loss. Second, methods like NL2Code and multi-modal LLM QA struggle to understand the complex layouts of semi-structured tables and cannot accurately answer the corresponding questions. To this end, we propose ST-Raptor, a tree-based framework for semi-structured table question answering using large language models. First, we introduce the Hierarchical Orthogonal Tree (HO-Tree), a structural model that captures complex semi-structured table layouts, along with an effective algorithm for constructing the tree. Second, we define a set of basic tree operations to guide LLMs in executing common QA tasks. Given a user question, ST-Raptor decomposes it into simpler sub-questions, generates corresponding tree operation pipelines, and conducts operation-table alignment for accurate pipeline execution. Third, we incorporate a two-stage verification mechanism: forward validation checks the correctness of execution steps, while backward validation evaluates answer reliability by reconstructing queries from predicted answers. To benchmark performance, we present SSTQA, a dataset of 764 questions over 102 real-world semi-structured tables. Experiments show that ST-Raptor outperforms nine baselines by up to 20% in answer accuracy. The code is available at https://github.com/weAIDB/ST-Raptor.
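To make the idea of answering questions through tree operations more concrete, the following is a minimal illustrative sketch in Python. It is not the HO-Tree or the ST-Raptor implementation, neither of which is specified in this abstract. The `Node` class, `build_tree`, `retrieve`, and the sample `report` data are all hypothetical names and values chosen for illustration. The sketch assumes a table with hierarchical headers can be represented as nested headers with cell values at the leaves, and shows one path-based retrieval operation that returns nothing when the requested path does not align with the table.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Node:
    """One header or value cell in a toy tree view of a table (hypothetical)."""
    label: str
    children: list["Node"] = field(default_factory=list)

    def child(self, label: str) -> Optional["Node"]:
        """Return the direct child with the given label, if any."""
        for c in self.children:
            if c.label == label:
                return c
        return None


def build_tree(nested: dict, label: str = "root") -> Node:
    """Turn nested dicts (hierarchical headers) into a tree; leaves hold cell values."""
    node = Node(label)
    for key, value in nested.items():
        if isinstance(value, dict):
            node.children.append(build_tree(value, key))
        else:
            # A header whose only child is a value node.
            node.children.append(Node(key, [Node(str(value))]))
    return node


def retrieve(root: Node, path: list[str]) -> Optional[str]:
    """Toy tree operation: follow a header path and return the value beneath it."""
    node = root
    for label in path:
        node = node.child(label)
        if node is None:
            return None  # the path does not align with the table layout
    # Only answer when the path ends at a header with exactly one value leaf.
    if len(node.children) == 1 and not node.children[0].children:
        return node.children[0].label
    return None


# Hypothetical report with two header levels under each top-level row header.
report = {
    "Revenue": {
        "Q1": {"Domestic": 120, "Export": 80},
        "Q2": {"Domestic": 150, "Export": 95},
    },
    "Cost": {"Q1": {"Total": 140}},
}
tree = build_tree(report)
print(retrieve(tree, ["Revenue", "Q2", "Export"]))  # 95
print(retrieve(tree, ["Revenue", "Q3", "Export"]))  # None
```

In this sketch, a question such as "What was export revenue in Q2?" would correspond to the path `["Revenue", "Q2", "Export"]`. Returning `None` for a misaligned path is only a loose analogue of checking an execution step; the verification mechanism described in the abstract is more involved.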
