

MTSQL-R1: Towards Long-Horizon Multi-Turn Text-to-SQL via Agentic Training

October 12, 2025
Authors: Taicheng Guo, Hai Wang, ChaoChun Liu, Mohsen Golalikhani, Xin Chen, Xiangliang Zhang, Chandan K. Reddy
cs.AI

Abstract

Multi-turn Text-to-SQL aims to translate a user's conversational utterances into executable SQL while preserving dialogue coherence and grounding to the target schema. However, most existing systems treat this task as simple text translation and follow a short-horizon paradigm, generating one query per turn without execution, explicit verification, or refinement, which leads to non-executable or incoherent outputs. We present MTSQL-R1, an agentic training framework for long-horizon multi-turn Text-to-SQL. We cast the task as a Markov Decision Process (MDP) in which an agent interacts with (i) a database for execution feedback and (ii) a persistent dialogue memory for coherence verification, performing an iterative propose to execute -> verify -> refine cycle until all checks pass. Experiments on CoSQL and SParC demonstrate that MTSQL-R1 consistently outperforms strong baselines, highlighting the importance of environment-driven verification and memory-guided refinement for conversational semantic parsing. Full recipes (including code, trained models, logs, reasoning trajectories, etc.) will be released after the internal review to contribute to community research.
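The cycle described in the abstract can be pictured with a small sketch. This is not the authors' implementation, which has not been released at the time of this listing; it is one possible reading of the loop under stated assumptions. The names `DialogueMemory`, `execute`, `solve_turn`, `propose`, `verify`, and `max_steps` are all hypothetical, the database is assumed to be SQLite, and the proposal and verification steps are plain callables standing in for whatever the trained agent actually does.

```python
import sqlite3
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple


@dataclass
class DialogueMemory:
    """Hypothetical persistent memory: (utterance, accepted SQL) per past turn."""
    turns: List[Tuple[str, str]] = field(default_factory=list)


def execute(conn: sqlite3.Connection, sql: str):
    """Run a query; return (rows, None) on success or (None, error message)."""
    try:
        return conn.execute(sql).fetchall(), None
    except sqlite3.Error as exc:
        return None, str(exc)


def solve_turn(
    conn: sqlite3.Connection,
    memory: DialogueMemory,
    utterance: str,
    propose: Callable[[str, DialogueMemory, Optional[str]], str],
    verify: Callable[[str, DialogueMemory, str, list], Optional[str]],
    max_steps: int = 4,  # assumed step budget, not taken from the paper
) -> Optional[str]:
    """Propose, execute, verify, and refine until all checks pass."""
    feedback: Optional[str] = None
    for _ in range(max_steps):
        # Propose (or refine) a query, conditioned on the latest feedback.
        sql = propose(utterance, memory, feedback)

        # Execution feedback from the database.
        rows, error = execute(conn, sql)
        if error is not None:
            feedback = f"execution error: {error}"
            continue

        # Coherence check against the dialogue memory; None means "passed".
        problem = verify(utterance, memory, sql, rows)
        if problem is None:
            memory.turns.append((utterance, sql))
            return sql
        feedback = f"verification failed: {problem}"

    # No candidate passed within the step budget.
    return None
```

In this sketch the feedback string is the only channel between a failed attempt and the next proposal, which mirrors the abstract's emphasis on environment-driven verification; how the actual framework encodes state, actions, and rewards in its MDP is not specified on this page.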
October 16, 2025