R^3-SQL: Ranking Reward and Resampling for Text-to-SQL
April 28, 2026
Authors: Hojae Han, Yeonseok Jeong, Seung-won Hwang, Zhewei Yao, Yuxiong He
cs.AI
Abstract
Modern Text-to-SQL systems generate multiple candidate SQL queries and rank them to select a final prediction. However, existing methods face two limitations. First, they often assign inconsistent scores to functionally equivalent SQL queries, even when those queries produce identical execution results. Second, ranking cannot recover when the correct SQL is absent from the candidate pool. We propose R^3-SQL, a Text-to-SQL framework that addresses both issues through a unified reward for ranking and resampling. R^3-SQL first groups candidates by execution result and ranks the groups rather than individual queries, so that equivalent queries are scored consistently. To score each group, it combines a pairwise preference across groups with a pointwise utility derived from the best group rank and size, capturing relative preference, consistency, and candidate quality. To improve candidate recall, R^3-SQL introduces agentic resampling, which judges the generated candidate pool and selectively resamples when the correct SQL is likely absent. R^3-SQL achieves 75.03 execution accuracy on BIRD-dev, a new state of the art among methods using models with disclosed sizes, with consistent gains across five benchmarks.
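The grouping step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`execution_signature`, `group_by_execution`, `pick_prediction`), the toy `emp` table, and the candidate queries are all hypothetical, and the group score used here is plain group size (majority vote), standing in for the paper's combined pairwise and pointwise reward, which the abstract does not specify in detail.

```python
import sqlite3
from collections import defaultdict


def execution_signature(conn, sql):
    """Run a query and return a hashable, order-insensitive summary of its rows.

    Assumption: row order is ignored, so queries differing only in ORDER BY
    fall into the same group. Returns None if the query fails to execute.
    """
    try:
        rows = conn.execute(sql).fetchall()
    except sqlite3.Error:
        return None
    return tuple(sorted(rows, key=repr))


def group_by_execution(conn, candidates):
    """Group candidate SQL strings that produce the same execution result."""
    groups = defaultdict(list)
    for sql in candidates:
        signature = execution_signature(conn, sql)
        if signature is not None:  # drop candidates that do not execute
            groups[signature].append(sql)
    return list(groups.values())


def pick_prediction(groups):
    """Return one query from the highest-scoring group, or None if no group exists.

    Placeholder score: group size only. A None result is the kind of signal
    on which a system might decide to resample more candidates.
    """
    if not groups:
        return None
    best_group = max(groups, key=len)
    return best_group[0]


# Toy database and candidate pool (hypothetical data for illustration).
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE emp (name TEXT, dept TEXT, salary INTEGER);
    INSERT INTO emp VALUES ('Ann', 'eng', 120), ('Bob', 'eng', 100), ('Cy', 'ops', 90);
    """
)
candidates = [
    "SELECT name FROM emp WHERE salary > 95",
    "SELECT name FROM emp WHERE NOT salary <= 95",  # equivalent to the first
    "SELECT name FROM emp WHERE dept = 'ops'",
    "SELECT nme FROM emp",  # invalid column, fails to execute
]
groups = group_by_execution(conn, candidates)
prediction = pick_prediction(groups)
print(len(groups), prediction)
```

Because scores attach to groups rather than to individual query strings, the two syntactically different but functionally equivalent queries above cannot receive different scores, which is the consistency property the abstract highlights.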