ChatPaper.ai


Synthesizing Text-to-SQL Data from Weak and Strong LLMs

August 6, 2024
Authors: Jiaxi Yang, Binyuan Hui, Min Yang, Jian Yang, Junyang Lin, Chang Zhou
cs.AI

Abstract

The capability gap between open-source and closed-source large language models (LLMs) remains a challenge in text-to-SQL tasks. In this paper, we introduce a synthetic data approach that combines data produced by larger, more powerful models (strong models) with error-information data generated by smaller, less well-aligned models (weak models). The method not only enhances the domain generalization of text-to-SQL models but also explores the potential of error-data supervision through preference learning. Furthermore, we employ the synthetic data approach for instruction tuning on open-source LLMs, resulting in SENSE, a specialized text-to-SQL model. The effectiveness of SENSE is demonstrated through state-of-the-art results on the SPIDER and BIRD benchmarks, bridging the performance gap between open-source models and methods prompted by closed-source models.
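The abstract describes two data streams: strong-model outputs used for instruction tuning, and weak-model outputs whose errors are turned into supervision signals for preference learning. A minimal sketch of that idea, assuming an executability check against the target database as the error signal (the function and field names here are illustrative, not the paper's actual pipeline):

```python
# Hypothetical sketch: strong-model SQL becomes a supervised fine-tuning
# example, while weak-model SQL is validated by executing it; failures are
# paired with the gold SQL as (chosen, rejected) preference pairs.
import sqlite3

def build_training_data(question, gold_sql, strong_sql, weak_sql, db):
    """Return (sft_examples, preference_pairs) for one question."""
    sft, prefs = [], []
    # Strong-model SQL is assumed high quality: use it for instruction tuning.
    sft.append({"prompt": question, "completion": strong_sql})
    # Execute the weak model's SQL; an execution error yields a preference pair.
    try:
        db.execute(weak_sql).fetchall()
    except sqlite3.Error:
        prefs.append({"prompt": question,
                      "chosen": gold_sql,
                      "rejected": weak_sql})
    return sft, prefs

# Toy database to demonstrate the executability check.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER, name TEXT)")
sft, prefs = build_training_data(
    "List all user names.",
    "SELECT name FROM users",       # gold SQL (chosen)
    "SELECT name FROM users",       # strong-model output
    "SELECT username FROM users",   # weak-model output: bad column name
    conn,
)
print(len(sft), len(prefs))  # → 1 1
```

The preference pairs would then feed a preference-learning objective (e.g. DPO-style training), while the SFT examples drive standard instruction tuning.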


PDF · November 28, 2024