PARROT: A Benchmark for Evaluating LLMs in Cross-System SQL Translation

September 27, 2025
Authors: Wei Zhou, Guoliang Li, Haoyu Wang, Yuxing Han, Xufei Wu, Fan Wu, Xuanhe Zhou
cs.AI

Abstract

Large language models (LLMs) have shown increasing effectiveness in Text-to-SQL tasks. However, a closely related problem, Cross-System SQL Translation (a.k.a. SQL-to-SQL), which adapts a query written for one database system (e.g., MySQL) into an equivalent query for another system (e.g., ClickHouse), is of great practical importance but remains underexplored. Existing SQL benchmarks are not well suited to SQL-to-SQL evaluation because they (1) focus on a limited set of database systems (often just SQLite) and (2) cannot capture many system-specific SQL dialects (e.g., customized functions, data types, and syntax rules). Thus, in this paper, we introduce PARROT, a Practical And Realistic BenchmaRk for CrOss-System SQL Translation. PARROT comprises 598 translation pairs from 38 open-source benchmarks and real-world business services, specifically prepared to challenge system-specific SQL understanding (e.g., LLMs achieve lower than 38.53% accuracy on average). We also provide multiple benchmark variants, including PARROT-Diverse with 28,003 translations (for extensive syntax testing) and PARROT-Simple with 5,306 representative samples (for focused stress testing), covering 22 production-grade database systems. To promote future research, we release a public leaderboard and source code at: https://code4db.github.io/parrot-bench/.