GLiNER2: An Efficient Multi-Task Information Extraction System with Schema-Driven Interface
July 24, 2025
Authors: Urchade Zaratiana, Gil Pasternak, Oliver Boyd, George Hurn-Maloney, Ash Lewis
cs.AI
Abstract
Information extraction (IE) is fundamental to numerous NLP applications, yet
existing solutions often require specialized models for different tasks or rely
on computationally expensive large language models. We present GLiNER2, a
unified framework that enhances the original GLiNER architecture to support
named entity recognition, text classification, and hierarchical structured data
extraction within a single efficient model. Built on a pretrained transformer
encoder architecture, GLiNER2 maintains CPU efficiency and compact size while
introducing multi-task composition through an intuitive schema-based interface.
Our experiments demonstrate competitive performance across extraction and
classification tasks with substantial improvements in deployment accessibility
compared to LLM-based alternatives. We release GLiNER2 as an open-source
pip-installable library with pre-trained models and documentation at
https://github.com/fastino-ai/GLiNER2.
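
To make the idea of multi-task composition through a schema more concrete, here is a minimal sketch of how one schema object might declare entity recognition, classification, and structured extraction together. It is an illustration only and does not reproduce the GLiNER2 library's API: the `Schema` class, its methods, and all label names are hypothetical, and no model is loaded or run.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Schema:
    """Hypothetical container declaring several extraction tasks at once.

    This is NOT the GLiNER2 API; it only illustrates schema-driven composition.
    """
    entity_types: List[str] = field(default_factory=list)
    classifications: Dict[str, List[str]] = field(default_factory=dict)
    structures: Dict[str, List[str]] = field(default_factory=dict)

    def entities(self, labels: List[str]) -> "Schema":
        # Declare the entity types for named entity recognition.
        self.entity_types.extend(labels)
        return self

    def classification(self, name: str, labels: List[str]) -> "Schema":
        # Declare a text classification task with its candidate labels.
        self.classifications[name] = list(labels)
        return self

    def structure(self, name: str, fields: List[str]) -> "Schema":
        # Declare a structured record and the fields to fill in.
        self.structures[name] = list(fields)
        return self

    def tasks(self) -> List[str]:
        # List every declared task, in declaration-group order.
        names: List[str] = []
        if self.entity_types:
            names.append("ner")
        names.extend(f"classify:{n}" for n in self.classifications)
        names.extend(f"structure:{n}" for n in self.structures)
        return names


# Compose three tasks into one schema (all labels are made-up examples).
schema = (
    Schema()
    .entities(["person", "company"])
    .classification("sentiment", ["positive", "negative", "neutral"])
    .structure("product", ["name", "price"])
)
print(schema.tasks())
```

The design point is that a single declarative object describes all requested outputs, so one model pass could in principle serve every task instead of routing text through separate task-specific models.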