
Speculative Ad-hoc Querying

March 2, 2025
Authors: Haoyu Li, Srikanth Kandula, Maria Angels de Luis Balaguer, Aditya Akella, Venkat Arun
cs.AI

Abstract

Analyzing large datasets requires responsive query execution, but executing SQL queries on massive datasets can be slow. This paper explores whether query execution can begin even before the user has finished typing, allowing results to appear almost instantly. We propose SpeQL, a system that leverages Large Language Models (LLMs) to predict likely queries based on the database schema, the user's past queries, and their incomplete query. Since exact query prediction is infeasible, SpeQL speculates on partial queries in two ways: 1) it predicts the query structure so that queries can be compiled and planned in advance, and 2) it precomputes temporary tables that are much smaller than the original database, yet are still predicted to contain all the information needed to answer the user's final query. Additionally, SpeQL continuously displays results for speculated queries and subqueries in real time, aiding exploratory analysis. A utility/user study showed that SpeQL improved task completion time, and participants reported that its speculative display of results helped them discover patterns in the data more quickly. In the study, SpeQL reduced users' query latency by up to 289× while keeping the overhead reasonable, at $4 per hour.
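
To make the speculation idea concrete, below is a minimal, self-contained Python sketch of the two mechanisms the abstract describes: guessing the user's final query before they finish typing, and materializing a small temporary table that can serve it. This is an illustration only, not the authors' SpeQL implementation; the function names, the fixed predicted query, and the use of an in-memory SQLite database are assumptions made for the example.

```python
# Sketch of speculative ad-hoc querying: predict the final query from a
# partial one, precompute it as a small temporary table, then answer the
# finished query from that table instead of scanning the base data.
# NOTE: illustration only; a real system would call an LLM and a full DBMS.
import sqlite3


def predict_final_query(schema: str, history: list[str], partial_sql: str) -> str:
    """Stand-in for the LLM prediction step.

    A real system would prompt a model with the database schema, the user's
    past queries, and the incomplete query; here we return a fixed,
    plausible completion for demonstration purposes.
    """
    return "SELECT region, SUM(amount) AS total FROM sales GROUP BY region"


def speculate(conn: sqlite3.Connection, predicted_sql: str) -> None:
    """Materialize the predicted query into a temporary table so the final
    query can be answered from a much smaller relation."""
    conn.execute("DROP TABLE IF EXISTS speculated")
    conn.execute(f"CREATE TEMP TABLE speculated AS {predicted_sql}")


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
    conn.executemany("INSERT INTO sales VALUES (?, ?)",
                     [("east", 10.0), ("west", 5.0), ("east", 2.5)])

    # The user has only typed a prefix of their query so far.
    partial = "SELECT region, SUM(amount) FROM sal"
    guess = predict_final_query("sales(region, amount)", [], partial)
    speculate(conn, guess)

    # Once the user finishes typing, the answer is served from the small
    # precomputed table rather than by re-scanning the base table.
    print(conn.execute("SELECT * FROM speculated").fetchall())
```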
