Cost-Efficient RAG for Entity Matching with LLMs: A Blocking-based Exploration

Abstract

Retrieval-augmented generation (RAG) improves the reasoning of large language models (LLMs) on knowledge-intensive tasks, but existing RAG pipelines incur substantial retrieval and generation overhead when applied to large-scale entity matching. To address this limitation, we introduce CE-RAG4EM, a cost-efficient RAG architecture that reduces computation through blocking-based batch retrieval and generation. We also present a unified framework for analyzing and evaluating RAG systems for entity matching, focusing on blocking-aware optimizations and retrieval granularity. Extensive experiments show that CE-RAG4EM can achieve comparable or improved matching quality while substantially reducing end-to-end runtime relative to strong baselines. Our analysis further reveals that key configuration parameters introduce an inherent trade-off between performance and overhead, offering practical guidance for designing efficient and scalable RAG systems for entity matching and data integration.
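To make the idea of blocking-based batch retrieval and generation concrete, the following is a minimal sketch of one possible reading of it, not the CE-RAG4EM implementation. It assumes a hypothetical blocking key (the first token of a record's name), a hypothetical `retrieve` callable standing in for any retriever, and a single prompt per block in place of one prompt per candidate pair. All function names and the toy records are illustrative assumptions.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Tuple

Record = Dict[str, str]
Pair = Tuple[Record, Record]


def blocking_key(record: Record) -> str:
    """Hypothetical blocking key: first token of the name, lowercased."""
    tokens = record["name"].lower().split()
    return tokens[0] if tokens else ""


def group_pairs_by_block(pairs: List[Pair]) -> Dict[str, List[Pair]]:
    """Group candidate pairs by the blocking key of their left record."""
    blocks: Dict[str, List[Pair]] = defaultdict(list)
    for left, right in pairs:
        blocks[blocking_key(left)].append((left, right))
    return dict(blocks)


def batched_prompts(pairs: List[Pair], retrieve: Callable[[str], str]) -> List[str]:
    """Build one prompt per block, sharing a single retrieval across the block."""
    prompts = []
    for key, block in group_pairs_by_block(pairs).items():
        context = retrieve(key)  # one retrieval call for the whole block
        lines = [
            f"{i + 1}. {left['name']} | {right['name']}"
            for i, (left, right) in enumerate(block)
        ]
        prompts.append(
            f"Context: {context}\nDo these pairs match?\n" + "\n".join(lines)
        )
    return prompts


# Toy usage: a fake retriever that records how often it is called.
calls: List[str] = []


def fake_retrieve(query: str) -> str:
    calls.append(query)
    return f"facts about {query}"


pairs = [
    ({"name": "Apple iPhone 13"}, {"name": "apple iphone13 128GB"}),
    ({"name": "Apple iPad Air"}, {"name": "iPad Air (Apple)"}),
    ({"name": "Sony WH-1000XM4"}, {"name": "Sony WH1000XM4 headphones"}),
]
prompts = batched_prompts(pairs, fake_retrieve)
print(len(pairs), "pairs ->", len(calls), "retrievals,", len(prompts), "prompts")
```

In this toy setting, the number of retrieval calls and prompts scales with the number of blocks rather than the number of candidate pairs. Coarser blocks would save more calls but pack more pairs into each prompt, which is one way the trade-off between overhead and matching quality noted in the abstract could arise.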
