

Synthesizing Text-to-SQL Data from Weak and Strong LLMs

August 6, 2024
作者: Jiaxi Yang, Binyuan Hui, Min Yang, Jian Yang, Junyang Lin, Chang Zhou
cs.AI

Abstract

The capability gap between open-source and closed-source large language models (LLMs) remains a challenge in text-to-SQL tasks. In this paper, we introduce a synthetic data approach that combines data produced by larger, more powerful models (strong models) with error-information data generated by smaller, not-well-aligned models (weak models). The method not only enhances the domain generalization of text-to-SQL models but also explores the potential of error-data supervision through preference learning. Furthermore, we employ the synthetic data approach for instruction tuning on open-source LLMs, resulting in SENSE, a specialized text-to-SQL model. The effectiveness of SENSE is demonstrated through state-of-the-art results on the SPIDER and BIRD benchmarks, bridging the performance gap between open-source models and methods prompted by closed-source models.
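The abstract mentions learning from weak-model error data via preference learning but does not spell out the objective. A common choice for this setup is a DPO-style loss over pairs of a correct SQL query (chosen) and an erroneous weak-model SQL query (rejected); the sketch below is an assumption for illustration, not the paper's stated method, and the function name and inputs are hypothetical. Inputs are sequence log-probabilities under the policy being tuned and a frozen reference model.

```python
import math

def dpo_preference_loss(logp_chosen, logp_rejected,
                        ref_logp_chosen, ref_logp_rejected, beta=0.1):
    """DPO-style loss for one (correct SQL, erroneous SQL) preference pair.

    Hypothetical sketch: inputs are sequence log-probabilities of the
    chosen (correct) and rejected (erroneous) SQL under the policy and
    under a frozen reference model. Lower loss means the policy favors
    the correct SQL more strongly than the reference does.
    """
    margin = beta * ((logp_chosen - ref_logp_chosen)
                     - (logp_rejected - ref_logp_rejected))
    # -log(sigmoid(margin)), written in a numerically direct form
    return math.log(1.0 + math.exp(-margin))

# When the policy assigns higher relative log-probability to the correct
# SQL than the reference does, the margin is positive and the loss drops
# below log(2); a reversed preference pushes it above log(2).
```

In practice the log-probabilities would come from scoring the two SQL strings token-by-token with the tuned and reference LLMs; `beta` controls how sharply the policy is pulled away from the reference.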

