GLiNER2: An Efficient Multi-Task Information Extraction System with Schema-Driven Interface

July 24, 2025
Authors: Urchade Zaratiana, Gil Pasternak, Oliver Boyd, George Hurn-Maloney, Ash Lewis
cs.AI

Abstract

Information extraction (IE) is fundamental to numerous NLP applications, yet existing solutions often require specialized models for different tasks or rely on computationally expensive large language models. We present GLiNER2, a unified framework that enhances the original GLiNER architecture to support named entity recognition, text classification, and hierarchical structured data extraction within a single efficient model. Built on a pretrained transformer encoder architecture, GLiNER2 maintains CPU efficiency and compact size while introducing multi-task composition through an intuitive schema-based interface. Our experiments demonstrate competitive performance across extraction and classification tasks, with substantial improvements in deployment accessibility compared to LLM-based alternatives. We release GLiNER2 as an open-source pip-installable library with pre-trained models and documentation at https://github.com/fastino-ai/GLiNER2.
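
To make the idea of schema-based multi-task composition concrete, here is a minimal sketch of how a single schema object might bundle entity recognition, text classification, and structured extraction into one request. This is not the GLiNER2 API: the `Schema` class, its methods, and the example labels are all hypothetical names chosen for illustration, and no model is run.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Schema:
    """Hypothetical task container; illustrative only, NOT the GLiNER2 API."""

    entity_types: list[str] = field(default_factory=list)
    classifications: dict[str, list[str]] = field(default_factory=dict)
    structures: dict[str, list[str]] = field(default_factory=dict)

    def entities(self, *types: str) -> Schema:
        # Request named entity recognition for the given entity types.
        self.entity_types.extend(types)
        return self

    def classification(self, name: str, labels: list[str]) -> Schema:
        # Request a text classification task with a fixed label set.
        self.classifications[name] = list(labels)
        return self

    def structure(self, name: str, fields: list[str]) -> Schema:
        # Request a structured record with the given field names.
        self.structures[name] = list(fields)
        return self

    def tasks(self) -> list[str]:
        # List the kinds of task this schema asks for, in a fixed order.
        requested = []
        if self.entity_types:
            requested.append("entities")
        if self.classifications:
            requested.append("classification")
        if self.structures:
            requested.append("structure")
        return requested


# Compose three tasks into one schema via method chaining.
schema = (
    Schema()
    .entities("company", "person")
    .classification("sentiment", ["positive", "negative", "neutral"])
    .structure("product", ["name", "price"])
)
print(schema.tasks())
```

In a design like this, the schema is plain data, so a single model pass could in principle serve every requested task rather than one specialized model per task. The library's actual interface is described in its documentation at the repository linked above.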