LLM-R2: A Large Language Model Enhanced Rule-based Rewrite System for Boosting Query Efficiency

Abstract

Query rewrite, which aims to generate more efficient queries by altering a SQL query's structure without changing the query result, has long been an important research problem. To maintain equivalence between the rewritten query and the original, traditional query rewrite methods modify queries by following predefined rewrite rules. Several problems remain, however. First, existing methods for finding the optimal choice or sequence of rewrite rules are limited, and the search is resource-intensive. Methods that discover new rewrite rules typically require complicated proofs of structural logic or extensive user interaction. Second, current query rewrite methods rely heavily on DBMS cost estimators, which are often inaccurate. In this paper, we address these problems by proposing LLM-R2, a novel query rewrite method that uses a large language model (LLM) to propose possible rewrite rules for a database rewrite system. To further improve the LLM's inference ability in recommending rewrite rules, we train a contrastive model with a curriculum to learn query representations and select effective query demonstrations for the LLM. Experimental results show that our method significantly improves query execution efficiency and outperforms the baseline methods. In addition, our method is highly robust across different datasets.
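The demonstration-selection idea can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes query representations are already available as plain vectors (in the paper they come from a trained contrastive model), ranks a pool of past queries by cosine similarity, and places the closest ones in a few-shot prompt that asks for rewrite rules. The function names, the prompt layout, and the rule names `RULE_A` and `RULE_B` are all hypothetical placeholders.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def select_demonstrations(query_vec, pool, k=1):
    """Return the k pool entries whose embeddings are most similar to query_vec."""
    ranked = sorted(pool, key=lambda d: cosine(query_vec, d["vec"]), reverse=True)
    return ranked[:k]

def build_prompt(query_sql, demos, candidate_rules):
    """Assemble a few-shot prompt: candidate rules, demonstrations, then the new query."""
    lines = ["Candidate rewrite rules: " + ", ".join(candidate_rules), ""]
    for d in demos:
        lines.append("Query: " + d["sql"])
        lines.append("Rules: " + ", ".join(d["rules"]))
        lines.append("")
    lines.append("Query: " + query_sql)
    lines.append("Rules:")
    return "\n".join(lines)

# Toy pool: hand-written 2-d vectors stand in for learned query representations,
# and the rule names are placeholders rather than rules from any real system.
pool = [
    {"sql": "SELECT * FROM t WHERE a IN (SELECT a FROM s)",
     "vec": [1.0, 0.0], "rules": ["RULE_A"]},
    {"sql": "SELECT a, COUNT(*) FROM t GROUP BY a",
     "vec": [0.0, 1.0], "rules": ["RULE_B"]},
]

demos = select_demonstrations([0.9, 0.1], pool, k=1)
prompt = build_prompt("SELECT * FROM u WHERE b IN (SELECT b FROM v)",
                      demos, ["RULE_A", "RULE_B"])
print(prompt)
```

In this sketch the LLM only names rules from a fixed candidate list; applying them would be left to a rule-based rewrite engine, which is how equivalence with the original query would be preserved.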
