SEAR: Schema-Based Evaluation and Routing for LLM Gateways

Abstract

Evaluating production LLM responses and routing requests across providers in LLM gateways both require fine-grained quality signals and operationally grounded decisions. To address this gap, we present SEAR, a schema-based evaluation and routing system for multi-model, multi-provider LLM gateways. SEAR defines an extensible relational schema covering both LLM evaluation signals (context, intent, response characteristics, issue attribution, and quality scores) and gateway operational metrics (latency, cost, throughput), with cross-table consistency links across around one hundred typed, SQL-queryable columns. To populate the evaluation signals reliably, SEAR proposes self-contained signal instructions, in-schema reasoning, and multi-stage generation that produces database-ready structured outputs. Because signals are derived through LLM reasoning rather than shallow classifiers, SEAR captures complex request semantics, enables human-interpretable routing explanations, and unifies evaluation and routing in a single query layer. Across thousands of production sessions, SEAR achieves strong signal accuracy on human-labeled data and supports practical routing decisions, including large cost reductions at comparable quality.
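To make the idea of a single query layer over evaluation signals and operational metrics more concrete, here is a minimal sketch using Python's standard sqlite3 module. It is not SEAR's actual schema or routing logic: the table names (evaluation, operations), the columns, the model names, the sample rows, and the quality tolerance are all assumptions made for illustration. The query picks, for each intent, the cheapest model whose average quality score is within a tolerance of the best model for that intent.

```python
import sqlite3

# Hypothetical two-table schema: one table for LLM-derived evaluation
# signals, one for gateway operational metrics, linked by session_id.
SCHEMA = """
CREATE TABLE evaluation (
    session_id    TEXT PRIMARY KEY,
    model         TEXT NOT NULL,
    intent        TEXT NOT NULL,
    quality_score REAL NOT NULL
);
CREATE TABLE operations (
    session_id TEXT PRIMARY KEY REFERENCES evaluation(session_id),
    latency_ms REAL NOT NULL,
    cost_usd   REAL NOT NULL
);
"""

# Candidate models per intent: those whose average quality is within
# :tolerance of the best average for that intent, cheapest first.
ROUTING_QUERY = """
WITH per_model AS (
    SELECT e.intent, e.model,
           AVG(e.quality_score) AS avg_quality,
           AVG(o.cost_usd)      AS avg_cost
    FROM evaluation AS e
    JOIN operations AS o ON o.session_id = e.session_id
    GROUP BY e.intent, e.model
),
best AS (
    SELECT intent, MAX(avg_quality) AS best_quality
    FROM per_model
    GROUP BY intent
)
SELECT p.intent, p.model, p.avg_quality, p.avg_cost
FROM per_model AS p
JOIN best AS b ON b.intent = p.intent
WHERE p.avg_quality >= b.best_quality - :tolerance
ORDER BY p.intent, p.avg_cost
"""


def build_demo_db():
    """Create an in-memory database with made-up sample sessions."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany(
        "INSERT INTO evaluation VALUES (?, ?, ?, ?)",
        [
            ("s1", "model-large", "code_generation", 4.6),
            ("s2", "model-large", "code_generation", 4.4),
            ("s3", "model-small", "code_generation", 4.4),
            ("s4", "model-small", "code_generation", 4.4),
            ("s5", "model-large", "summarization", 4.8),
            ("s6", "model-small", "summarization", 3.1),
        ],
    )
    conn.executemany(
        "INSERT INTO operations VALUES (?, ?, ?)",
        [
            ("s1", 2100.0, 0.030),
            ("s2", 1900.0, 0.030),
            ("s3", 600.0, 0.004),
            ("s4", 650.0, 0.004),
            ("s5", 1500.0, 0.020),
            ("s6", 400.0, 0.003),
        ],
    )
    return conn


def route(conn, tolerance=0.2):
    """Return {intent: model}, choosing the cheapest acceptable model."""
    choice = {}
    for intent, model, _quality, _cost in conn.execute(
        ROUTING_QUERY, {"tolerance": tolerance}
    ):
        # Rows are sorted by cost within each intent, so keep the first.
        choice.setdefault(intent, model)
    return choice


if __name__ == "__main__":
    print(route(build_demo_db()))
```

Because the routing rule is an ordinary SQL query over typed columns, the rows it returns (intent, model, average quality, average cost) can double as a human-readable justification for the routing choice, which is the kind of interpretability the abstract describes.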
