UniversalNER: Targeted Distillation from Large Language Models for Open Named Entity Recognition
August 7, 2023
Authors: Wenxuan Zhou, Sheng Zhang, Yu Gu, Muhao Chen, Hoifung Poon
cs.AI
Abstract
Large language models (LLMs) have demonstrated remarkable generalizability,
such as understanding arbitrary entities and relations. Instruction tuning has
proven effective for distilling LLMs into more cost-efficient models such as
Alpaca and Vicuna. Yet such student models still trail the original LLMs by
large margins in downstream applications. In this paper, we explore targeted
distillation with mission-focused instruction tuning to train student models
that can excel in a broad application class such as open information
extraction. Using named entity recognition (NER) as a case study, we show how
ChatGPT can be distilled into much smaller UniversalNER models for open NER.
For evaluation, we assemble the largest NER benchmark to date, comprising 43
datasets across 9 diverse domains such as biomedicine, programming, social
media, law, and finance. Without using any direct supervision, UniversalNER attains
remarkable NER accuracy across tens of thousands of entity types, outperforming
general instruction-tuned models such as Alpaca and Vicuna by over 30 absolute
F1 points on average. With a tiny fraction of the parameters, UniversalNER not only
acquires ChatGPT's capability in recognizing arbitrary entity types, but also
outperforms its NER accuracy by 7-9 absolute F1 points on average. Remarkably,
UniversalNER even outperforms, by a large margin, state-of-the-art multi-task
instruction-tuned systems such as InstructUIE, which uses supervised NER
examples. We also conduct thorough ablation studies to assess the impact of
various components in our distillation approach. We will release the
distillation recipe, data, and UniversalNER models to facilitate future
research on targeted distillation.
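To make the targeted-distillation idea more concrete, below is a minimal sketch of how a single conversation-style NER instruction-tuning example might be assembled from an LLM-annotated passage. The template wording, the build_ner_example helper, and the JSON output format are illustrative assumptions, not the released UniversalNER data format.

```python
import json

# Hypothetical template for mission-focused instruction tuning on open NER:
# the passage is shown once, then each entity type becomes its own query turn.
# The exact wording and the JSON output format are assumptions for illustration.
PASSAGE_TURN = "Text: {passage}"
QUERY_TURN = "What describes {entity_type} in the text?"

def build_ner_example(passage: str, type_to_mentions: dict[str, list[str]]) -> list[dict]:
    """Build one multi-turn training example from an LLM-annotated passage."""
    turns = [
        {"role": "user", "content": PASSAGE_TURN.format(passage=passage)},
        {"role": "assistant", "content": "I've read this text."},
    ]
    for entity_type, mentions in type_to_mentions.items():
        turns.append({"role": "user",
                      "content": QUERY_TURN.format(entity_type=entity_type)})
        # The student model is trained to emit the mentions as a JSON list.
        turns.append({"role": "assistant", "content": json.dumps(mentions)})
    return turns

example = build_ner_example(
    "Aspirin reduces fever and was first synthesized at Bayer in 1897.",
    {"drug": ["Aspirin"], "organization": ["Bayer"]},
)
print(json.dumps(example, indent=2))
```

Framing each entity type as a separate query turn is one way a student model can be steered toward arbitrary, user-specified entity types at inference time rather than a fixed label set.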