SHARE: An SLM-based Hierarchical Action CorREction Assistant for Text-to-SQL

Abstract

Current self-correction approaches in text-to-SQL face two critical limitations: 1) conventional self-correction methods rely on recursive self-calls of LLMs, resulting in multiplicative computational overhead, and 2) LLMs struggle to perform effective error detection and correction for declarative SQL queries, as they fail to demonstrate the underlying reasoning path. In this work, we propose SHARE, an SLM-based Hierarchical Action corREction assistant that enables LLMs to perform more precise error localization and efficient correction. SHARE orchestrates three specialized Small Language Models (SLMs) in a sequential pipeline: it first transforms declarative SQL queries into stepwise action trajectories that reveal the underlying reasoning, then applies a two-phase granular refinement. We further propose a novel hierarchical self-evolution strategy for data-efficient training. Experimental results demonstrate that SHARE effectively enhances self-correction capabilities while proving robust across various LLMs. Furthermore, our comprehensive analysis shows that SHARE maintains strong performance even in low-resource training settings, which is particularly valuable for text-to-SQL applications with data privacy constraints.

