SAFE-SQL: Self-Augmented In-Context Learning with Fine-grained Example Selection for Text-to-SQL
February 17, 2025
Authors: Jimin Lee, Ingeol Baek, Byeongjeong Kim, Hwanhee Lee
cs.AI
Abstract
Text-to-SQL aims to convert natural language questions into executable SQL
queries. While previous approaches, such as skeleton-masked selection, have
demonstrated strong performance by retrieving similar training examples to
guide large language models (LLMs), they struggle in real-world scenarios where
such examples are unavailable. To overcome this limitation, we propose
Self-Augmentation in-context learning with Fine-grained Example selection for
Text-to-SQL (SAFE-SQL), a novel framework that improves SQL generation by
generating and filtering self-augmented examples. SAFE-SQL first prompts an LLM
to generate multiple Text-to-SQL examples relevant to the test input. Then
SAFE-SQL filters these examples through three relevance assessments,
constructing high-quality in-context learning examples. Using self-generated
examples, SAFE-SQL surpasses previous zero-shot and few-shot Text-to-SQL
frameworks, achieving higher execution accuracy. Notably, our approach provides
additional performance gains in extra hard and unseen scenarios, where
conventional methods often fail.
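The generate-then-filter idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the three relevance assessments, so the three scorers below (`token_overlap`, `looks_like_select`, `count_hint_matches`), the averaging rule, the `threshold` value, and the prompt layout are all hypothetical placeholders. The candidate examples are passed in directly, standing in for examples that an LLM would generate for the test question.

```python
import re
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass(frozen=True)
class Example:
    """One self-generated Text-to-SQL example (question plus SQL)."""
    question: str
    sql: str


# A scorer maps (test question, candidate example) to a score in [0, 1].
Scorer = Callable[[str, Example], float]


def _tokens(text: str) -> set:
    return set(re.findall(r"[a-z0-9_]+", text.lower()))


def token_overlap(test_question: str, ex: Example) -> float:
    """Hypothetical scorer 1: Jaccard overlap between the two questions."""
    a, b = _tokens(test_question), _tokens(ex.question)
    return len(a & b) / len(a | b) if (a | b) else 0.0


def looks_like_select(test_question: str, ex: Example) -> float:
    """Hypothetical scorer 2: crude check that the SQL is a SELECT query."""
    return 1.0 if ex.sql.strip().lower().startswith("select") else 0.0


def count_hint_matches(test_question: str, ex: Example) -> float:
    """Hypothetical scorer 3: 'how many' questions should pair with COUNT."""
    wants_count = "how many" in test_question.lower()
    has_count = "count(" in ex.sql.lower()
    return 1.0 if wants_count == has_count else 0.0


def select_examples(
    test_question: str,
    candidates: Sequence[Example],
    scorers: Sequence[Scorer],
    threshold: float = 0.6,  # assumed cutoff, not taken from the paper
) -> List[Example]:
    """Keep candidates whose mean score across all scorers meets the threshold."""
    kept = []
    for ex in candidates:
        scores = [score(test_question, ex) for score in scorers]
        if sum(scores) / len(scores) >= threshold:
            kept.append(ex)
    return kept


def build_prompt(test_question: str, examples: Sequence[Example]) -> str:
    """Assemble an in-context prompt from the examples that survived filtering."""
    parts = [f"Question: {ex.question}\nSQL: {ex.sql}" for ex in examples]
    parts.append(f"Question: {test_question}\nSQL:")
    return "\n\n".join(parts)


if __name__ == "__main__":
    question = "How many singers are there?"
    candidates = [
        Example("How many singers are older than 30?",
                "SELECT COUNT(*) FROM singer WHERE age > 30"),
        Example("List all stadium names", "SELECT name FROM stadium"),
    ]
    scorers = [token_overlap, looks_like_select, count_hint_matches]
    print(build_prompt(question, select_examples(question, candidates, scorers)))
```

In a real system the candidates would come from an LLM call and the scorers would likely be LLM-based or embedding-based judgments; the toy heuristics here only show how several relevance scores can be combined into a keep-or-drop decision before the final prompt is built.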