Chain-of-Table: Evolving Tables in the Reasoning Chain for Table Understanding

Abstract

Table-based reasoning with large language models (LLMs) is a promising direction for tackling many table understanding tasks, such as table-based question answering and fact verification. Compared with generic reasoning, table-based reasoning requires extracting the underlying semantics from both free-form questions and semi-structured tabular data. Chain-of-Thought and similar approaches represent the reasoning chain as textual context, but how to effectively leverage tabular data in the reasoning chain remains an open question. We propose the Chain-of-Table framework, in which tabular data is used explicitly in the reasoning chain as a proxy for intermediate thoughts. Specifically, we guide LLMs with in-context learning to iteratively generate operations and update the table, so that the sequence of tables represents a tabular reasoning chain. The LLM can therefore dynamically plan the next operation based on the results of the previous ones. This continuous evolution of the table forms a chain that shows the reasoning process for a given tabular problem. The chain carries structured information about the intermediate results, enabling more accurate and reliable predictions. Chain-of-Table achieves new state-of-the-art performance on the WikiTQ, FeTaQA, and TabFact benchmarks across multiple LLM choices.
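The iterative loop described in the abstract (plan an operation, apply it to the table, then plan again from the updated table) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the operation names (`select_rows`, `select_columns`, `sort_by`), the `chain_of_table` function, and the toy table are all hypothetical, and `scripted_planner` is a fixed stand-in for the LLM call that would choose each operation via in-context learning.

```python
from typing import Callable, Dict, List, Optional, Tuple

Table = List[Dict[str, object]]
Operation = Tuple[str, dict]  # (operation name, keyword arguments)


# Hypothetical table operations; the abstract does not name a specific set.
def select_rows(table: Table, column: str, value: object) -> Table:
    return [row for row in table if row[column] == value]


def select_columns(table: Table, columns: List[str]) -> Table:
    return [{c: row[c] for c in columns} for row in table]


def sort_by(table: Table, column: str, descending: bool = False) -> Table:
    return sorted(table, key=lambda row: row[column], reverse=descending)


OPERATIONS: Dict[str, Callable[..., Table]] = {
    "select_rows": select_rows,
    "select_columns": select_columns,
    "sort_by": sort_by,
}

Planner = Callable[[Table, str, List[Operation]], Optional[Operation]]


def chain_of_table(table: Table, question: str, plan_next: Planner,
                   max_steps: int = 5) -> Tuple[Table, List[Operation]]:
    """Repeatedly ask the planner for an operation and apply it to the table."""
    chain: List[Operation] = []
    for _ in range(max_steps):
        step = plan_next(table, question, chain)  # planner sees the current table
        if step is None:  # planner signals that the chain is complete
            break
        name, args = step
        table = OPERATIONS[name](table, **args)  # the updated table is the next "thought"
        chain.append(step)
    return table, chain


def scripted_planner(table: Table, question: str,
                     chain: List[Operation]) -> Optional[Operation]:
    """Assumption: a fixed script standing in for an LLM prompted in context."""
    script: List[Operation] = [
        ("select_rows", {"column": "country", "value": "ESP"}),
        ("sort_by", {"column": "rank"}),
        ("select_columns", {"columns": ["cyclist"]}),
    ]
    return script[len(chain)] if len(chain) < len(script) else None


# Toy data, invented for illustration only.
TABLE: Table = [
    {"rank": 2, "cyclist": "B", "country": "ESP"},
    {"rank": 1, "cyclist": "A", "country": "ESP"},
    {"rank": 3, "cyclist": "C", "country": "ITA"},
]

final_table, chain = chain_of_table(
    TABLE, "Which Spanish cyclist ranked highest?", scripted_planner
)
```

In this sketch the final table holds only the `cyclist` column for the Spanish riders, ordered by rank, so the answer can be read from its first row. In a real system the planner and the final answer step would both be LLM calls conditioned on the evolving table.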
