Feather-SQL: A Lightweight NL2SQL Framework with Dual-Model Collaboration Paradigm for Small Language Models
March 22, 2025
Authors: Wenqi Pei, Hailing Xu, Hengyuan Zhao, Shizheng Hou, Han Chen, Zining Zhang, Pingyi Luo, Bingsheng He
cs.AI
Abstract
Natural Language to SQL (NL2SQL) has seen significant advancements with large
language models (LLMs). However, these models often depend on closed-source
systems and high computational resources, posing challenges in data privacy and
deployment. In contrast, small language models (SLMs) struggle with NL2SQL
tasks, exhibiting poor performance and incompatibility with existing
frameworks. To address these issues, we introduce Feather-SQL, a new
lightweight framework tailored for SLMs. Feather-SQL improves SQL executability
and accuracy through 1) schema pruning and linking and 2) multi-path and
multi-candidate generation. Additionally, we introduce the 1+1 Model
Collaboration Paradigm, which pairs a strong general-purpose chat model with a
fine-tuned SQL specialist, combining strong analytical reasoning with
high-precision SQL generation. Experimental results on BIRD demonstrate that
Feather-SQL improves NL2SQL performance on SLMs, with a boost of around 10% for
models without fine-tuning. The proposed paradigm raises the accuracy ceiling
of SLMs to 54.76%, highlighting its effectiveness.
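
The abstract names the ingredients but gives no implementation details, so the following is only a rough illustrative sketch of how multi-candidate generation and the 1+1 pairing might fit together, not the authors' method. It assumes a SQLite database, that a candidate is kept if it executes without error, and that the two models are plain callables. The names `first_executable`, `answer`, `reasoner`, and `sql_writer` are hypothetical, and the stub models stand in for real SLM calls.

```python
import sqlite3
from typing import Callable, Iterable, Optional


def first_executable(conn: sqlite3.Connection,
                     candidates: Iterable[str]) -> Optional[str]:
    """Return the first candidate query that SQLite can execute, else None."""
    for sql in candidates:
        try:
            conn.execute(sql).fetchall()
        except sqlite3.Error:
            continue  # not executable: try the next candidate
        return sql
    return None


def answer(question: str,
           schema: str,
           conn: sqlite3.Connection,
           reasoner: Callable[[str, str], str],
           sql_writer: Callable[[str, str, int], str],
           n_candidates: int = 3) -> Optional[str]:
    """Hypothetical 1+1 flow: a chat model reasons, a SQL model writes queries."""
    # Assumption: the general-purpose model produces hints (e.g. relevant tables).
    hints = reasoner(question, schema)
    # Assumption: the SQL specialist is sampled several times for candidates.
    candidates = [sql_writer(question, hints, i) for i in range(n_candidates)]
    return first_executable(conn, candidates)


# Toy database and stub models, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE singer (id INTEGER, name TEXT, age INTEGER)")
conn.execute("INSERT INTO singer VALUES (1, 'Ann', 30)")


def stub_reasoner(question: str, schema: str) -> str:
    return "relevant table: singer"


def stub_writer(question: str, hints: str, i: int) -> str:
    # The first candidate has a misspelled column and fails to execute.
    return ["SELECT nme FROM singer", "SELECT name FROM singer"][i % 2]


chosen = answer("List all singer names", "singer(id, name, age)",
                conn, stub_reasoner, stub_writer, n_candidates=2)
```

In this toy run the misspelled first candidate is skipped and the second is selected. A fuller system would presumably also rank among several executable candidates rather than stopping at the first one.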