GLiClass: Generalist Lightweight Model for Sequence Classification Tasks

Abstract

Classification is one of the most widespread tasks in AI applications, often serving as the first step in filtering, sorting, and categorizing data. Because modern AI systems must handle large volumes of input data and errors made in early pipeline stages can propagate downstream, high efficiency and accuracy are critical. Moreover, classification requirements can change dynamically with user needs, which calls for models with strong zero-shot capabilities. Generative LLMs have become mainstream for zero-shot classification thanks to their versatility, but they follow instructions inconsistently and are computationally inefficient. Cross-encoders, commonly used as rerankers in RAG pipelines, face a different bottleneck: they must process text-label pairs sequentially, which significantly reduces efficiency with large label sets. Embedding-based approaches offer good efficiency but struggle with complex scenarios involving logical and semantic constraints. We propose GLiClass, a novel method that adapts the GLiNER architecture for sequence classification tasks. Our approach achieves strong accuracy and efficiency comparable to embedding-based methods while maintaining the flexibility needed for zero-shot and few-shot learning scenarios. Additionally, we adapt proximal policy optimization (PPO) to multi-label text classification, which enables training classifiers in data-sparse conditions or from human feedback.
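The efficiency argument about cross-encoders can be made concrete with a small sketch. It assumes that a GLiNER-style model places all candidate labels in the same input as the text, so that one encoder pass scores every label; the abstract does not spell out this detail. The marker strings and function names below are hypothetical and are not taken from the GLiClass codebase.

```python
from typing import List, Tuple

# Hypothetical marker strings, chosen for illustration only.
LABEL_MARKER = "[LABEL]"
SEP_MARKER = "[SEP]"


def cross_encoder_inputs(text: str, labels: List[str]) -> List[Tuple[str, str]]:
    """Build one (text, label) pair per label; each pair needs its own forward pass."""
    return [(text, label) for label in labels]


def joint_input(text: str, labels: List[str]) -> str:
    """Build a single input holding every label plus the text (assumed GLiNER-style layout)."""
    prefix = " ".join(f"{LABEL_MARKER} {label}" for label in labels)
    return f"{prefix} {SEP_MARKER} {text}"


def forward_passes(num_texts: int, num_labels: int, joint: bool) -> int:
    """Count encoder forward passes, ignoring batching and sequence-length limits."""
    return num_texts if joint else num_texts * num_labels


if __name__ == "__main__":
    labels = ["sports", "politics", "technology"]
    text = "The new GPU doubles inference throughput."
    print(len(cross_encoder_inputs(text, labels)))
    print(joint_input(text, labels))
    print(forward_passes(1000, 50, joint=False), forward_passes(1000, 50, joint=True))
```

Under these assumptions, 1,000 texts and 50 labels cost 50,000 cross-encoder passes but only 1,000 joint passes. The count ignores that a joint input grows longer as labels are added, so it is an upper bound on the savings rather than a measured speedup.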

