
GLiClass: Generalist Lightweight Model for Sequence Classification Tasks

August 11, 2025
Authors: Ihor Stepanov, Mykhailo Shtopko, Dmytro Vodianytskyi, Oleksandr Lukashov, Alexander Yavorskyi, Mykyta Yaroshenko
cs.AI

Abstract

Classification is one of the most widespread tasks in AI applications, serving often as the first step in filtering, sorting, and categorizing data. Since modern AI systems must handle large volumes of input data and early pipeline stages can propagate errors downstream, achieving high efficiency and accuracy is critical. Moreover, classification requirements can change dynamically based on user needs, necessitating models with strong zero-shot capabilities. While generative LLMs have become mainstream for zero-shot classification due to their versatility, they suffer from inconsistent instruction following and computational inefficiency. Cross-encoders, commonly used as rerankers in RAG pipelines, face a different bottleneck: they must process text-label pairs sequentially, significantly reducing efficiency with large label sets. Embedding-based approaches offer good efficiency but struggle with complex scenarios involving logical and semantic constraints. We propose GLiClass, a novel method that adapts the GLiNER architecture for sequence classification tasks. Our approach achieves strong accuracy and efficiency comparable to embedding-based methods, while maintaining the flexibility needed for zero-shot and few-shot learning scenarios. Additionally, we adapted proximal policy optimization (PPO) for multi-label text classification, enabling training classifiers in data-sparse conditions or from human feedback.
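The efficiency contrast drawn in the abstract can be made concrete with a toy sketch. This is not the paper's implementation: the mock scoring functions and counters below are hypothetical, and the token-overlap "score" merely stands in for a real model. The point is the cost structure: a cross-encoder needs one forward pass per (text, label) pair, while a GLiClass-style model encodes the text with all candidate labels and scores them in a single pass.

```python
# Toy illustration of the inference-cost difference described in the
# abstract. All names here are illustrative, not from the paper.

forward_passes = {"cross_encoder": 0, "single_pass": 0}

def cross_encoder_score(text: str, label: str) -> float:
    """Mock cross-encoder: one forward pass per text-label pair."""
    forward_passes["cross_encoder"] += 1
    # Stand-in relevance score: token overlap between text and label.
    text_tokens = set(text.lower().split())
    label_tokens = set(label.lower().split())
    return len(text_tokens & label_tokens) / max(len(label_tokens), 1)

def single_pass_score(text: str, labels: list[str]) -> dict[str, float]:
    """Mock GLiClass-style model: all labels are scored jointly with
    the text, so the whole label set costs one forward pass."""
    forward_passes["single_pass"] += 1
    text_tokens = set(text.lower().split())
    return {
        lab: len(text_tokens & set(lab.lower().split()))
        / max(len(set(lab.lower().split())), 1)
        for lab in labels
    }

labels = ["sports news", "politics", "finance report", "weather update"]
texts = [
    "the finance report shows strong growth",
    "rain expected in the weather update today",
]

for t in texts:
    ce = {lab: cross_encoder_score(t, lab) for lab in labels}
    sp = single_pass_score(t, labels)
    assert ce == sp  # identical scores, very different cost

# Cross-encoder cost grows with len(texts) * len(labels);
# single-pass cost grows only with len(texts).
print(forward_passes)
```

With 2 texts and 4 labels, the mock cross-encoder runs 8 forward passes against 2 for the single-pass model; with large label sets this gap is exactly the bottleneck the abstract attributes to rerankers in RAG pipelines.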