ST-Raptor: LLM-Powered Semi-Structured Table Question Answering

Abstract

Semi-structured tables are widely used in real-world applications (e.g., financial reports, medical records, transactional orders) and often have flexible, complex layouts (e.g., hierarchical headers and merged cells). Interpreting these layouts and answering natural language questions about the tables generally relies on human analysts, which is costly and inefficient. Existing methods for automating this procedure face significant challenges. First, methods like NL2SQL require converting semi-structured tables into structured ones, which often causes substantial information loss. Second, methods like NL2Code and multi-modal LLM QA struggle to understand the complex layouts of semi-structured tables and cannot accurately answer the corresponding questions. To this end, we propose ST-Raptor, a tree-based framework for semi-structured table question answering using large language models. First, we introduce the Hierarchical Orthogonal Tree (HO-Tree), a structural model that captures complex semi-structured table layouts, along with an effective algorithm for constructing the tree. Second, we define a set of basic tree operations to guide LLMs in executing common QA tasks. Given a user question, ST-Raptor decomposes it into simpler sub-questions, generates the corresponding tree operation pipelines, and conducts operation-table alignment for accurate pipeline execution. Third, we incorporate a two-stage verification mechanism: forward validation checks the correctness of execution steps, while backward validation evaluates answer reliability by reconstructing queries from predicted answers. To benchmark performance, we present SSTQA, a dataset of 764 questions over 102 real-world semi-structured tables. Experiments show that ST-Raptor outperforms nine baselines by up to 20% in answer accuracy. The code is available at https://github.com/weAIDB/ST-Raptor.
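The abstract does not spell out the HO-Tree or its operations, so the following is only a rough illustration of the general idea of keeping hierarchical headers as a tree and answering a lookup by walking a header path. It is not the ST-Raptor implementation: `Node`, `build`, `lookup`, and the sample figures are hypothetical names and made-up data introduced for this sketch.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Node:
    """A header or value cell; children are the cells nested beneath it."""
    label: str
    children: list["Node"] = field(default_factory=list)

    def child(self, label: str) -> Optional["Node"]:
        """Return the direct child with this label, or None if absent."""
        for c in self.children:
            if c.label == label:
                return c
        return None


def build(spec: dict) -> list[Node]:
    """Turn a nested dict (header -> sub-headers or value) into nodes."""
    nodes = []
    for label, sub in spec.items():
        node = Node(label)
        if isinstance(sub, dict):
            node.children = build(sub)          # nested header level
        else:
            node.children = [Node(str(sub))]    # value cell under the header
        nodes.append(node)
    return nodes


def lookup(root: Node, path: list[str]) -> list[str]:
    """Follow a header path from the root; return the leaf labels below it."""
    node = root
    for label in path:
        node = node.child(label)
        if node is None:
            return []                           # header path not in the table
    stack, leaves = [node], []
    while stack:
        cur = stack.pop()
        if cur.children:
            stack.extend(reversed(cur.children))  # keep left-to-right order
        else:
            leaves.append(cur.label)
    return leaves


# Made-up table: two top-level headers, each split by quarter.
report = Node("Financial report", build({
    "Revenue": {"Q1": 120, "Q2": 135},
    "Costs": {"Q1": 80, "Q2": 90},
}))

print(lookup(report, ["Revenue", "Q2"]))  # ['135']
print(lookup(report, ["Revenue"]))        # ['120', '135']
```

Keeping the header hierarchy explicit means nothing has to be flattened into a single header row before a question is answered, which is the kind of information loss the abstract attributes to conversion-based approaches.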
