SAFE-SQL: Self-Augmented In-Context Learning with Fine-grained Example Selection for Text-to-SQL

Abstract

Text-to-SQL aims to convert natural language questions into executable SQL queries. Previous approaches, such as skeleton-masked selection, have demonstrated strong performance by retrieving similar training examples to guide large language models (LLMs), but they struggle in real-world scenarios where such examples are unavailable. To overcome this limitation, we propose Self-Augmentation in-context learning with Fine-grained Example selection for Text-to-SQL (SAFE-SQL), a novel framework that improves SQL generation by generating and filtering self-augmented examples. SAFE-SQL first prompts an LLM to generate multiple Text-to-SQL examples relevant to the test input. It then filters these examples through three relevance assessments to construct high-quality in-context learning examples. Using self-generated examples, SAFE-SQL surpasses previous zero-shot and few-shot Text-to-SQL frameworks, achieving higher execution accuracy. Notably, our approach provides additional performance gains in extra hard and unseen scenarios, where conventional methods often fail.
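
The generate-then-filter pipeline described in the abstract can be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: the abstract does not name the three relevance assessments or say how their scores are combined, so the `Scorer` callables, the averaging rule, the `threshold` value, the prompt layout, and the `generate` and `answer` callables (stand-ins for LLM calls) are all assumptions.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass
class Example:
    """One self-generated Text-to-SQL example."""
    question: str
    sql: str


# Hypothetical scorer type: (test_question, candidate) -> relevance in [0, 1].
# The abstract mentions three relevance assessments but does not define them.
Scorer = Callable[[str, Example], float]


def filter_examples(
    test_question: str,
    candidates: Sequence[Example],
    scorers: Sequence[Scorer],
    threshold: float = 0.5,  # assumed cutoff, not taken from the source
) -> List[Example]:
    """Keep candidates whose mean relevance score reaches the threshold."""
    kept = []
    for ex in candidates:
        scores = [score(test_question, ex) for score in scorers]
        # Averaging is an assumption; other combination rules are possible.
        if sum(scores) / len(scores) >= threshold:
            kept.append(ex)
    return kept


def build_prompt(test_question: str, examples: Sequence[Example]) -> str:
    """Place the filtered examples before the test question (assumed layout)."""
    parts = [f"Question: {ex.question}\nSQL: {ex.sql}" for ex in examples]
    parts.append(f"Question: {test_question}\nSQL:")
    return "\n\n".join(parts)


def safe_sql_sketch(
    test_question: str,
    generate: Callable[[str], List[Example]],  # stand-in for the example-generating LLM call
    scorers: Sequence[Scorer],
    answer: Callable[[str], str],  # stand-in for the final SQL-generating LLM call
    threshold: float = 0.5,
) -> str:
    """Generate candidate examples, filter them, then answer with the kept ones in context."""
    candidates = generate(test_question)
    kept = filter_examples(test_question, candidates, scorers, threshold)
    return answer(build_prompt(test_question, kept))
```

In use, `generate` and `answer` would wrap calls to an LLM, and three scorer functions would be passed as `scorers`. Passing the scorers as plain callables keeps the sketch independent of any particular definition of relevance.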

