SLIMER-IT: Zero-Shot NER on Italian Language

September 24, 2024
Authors: Andrew Zamai, Leonardo Rigutini, Marco Maggini, Andrea Zugarini
cs.AI

Abstract

Traditional approaches to Named Entity Recognition (NER) frame the task as a BIO sequence labeling problem. Although these systems often excel in the downstream task at hand, they require extensive annotated data and struggle to generalize to out-of-distribution input domains and unseen entity types. In contrast, Large Language Models (LLMs) have demonstrated strong zero-shot capabilities. While several works address zero-shot NER in English, little has been done in other languages. In this paper, we define an evaluation framework for zero-shot NER and apply it to the Italian language. Furthermore, we introduce SLIMER-IT, the Italian version of SLIMER, an instruction-tuning approach for zero-shot NER that leverages prompts enriched with definitions and guidelines. Comparisons with other state-of-the-art models demonstrate the superiority of SLIMER-IT on never-before-seen entity tags.
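
For readers unfamiliar with the BIO scheme mentioned in the abstract, the following is a minimal illustrative sketch of how BIO tags map back to entity spans. It is not the authors' code; the function name `bio_to_spans`, the example sentence, and the `PER`/`LOC` tag names are assumptions chosen for illustration only.

```python
from typing import List, Optional, Tuple


def bio_to_spans(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Collect (entity_text, entity_type) pairs from a BIO-tagged sentence.

    Hypothetical helper for illustration: B-X opens an entity of type X,
    I-X continues it, and O marks a token outside any entity.
    """
    spans: List[Tuple[str, str]] = []
    current_tokens: List[str] = []
    current_type: Optional[str] = None

    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            # A new entity begins; close any entity that was still open.
            if current_tokens:
                spans.append((" ".join(current_tokens), current_type))
            current_tokens = [token]
            current_type = tag[2:]
        elif tag.startswith("I-") and current_type == tag[2:]:
            # Continuation of the currently open entity.
            current_tokens.append(token)
        else:
            # "O" tag (or an inconsistent I- tag): close the open entity.
            if current_tokens:
                spans.append((" ".join(current_tokens), current_type))
            current_tokens, current_type = [], None

    # Flush an entity that runs to the end of the sentence.
    if current_tokens:
        spans.append((" ".join(current_tokens), current_type))
    return spans


# Made-up Italian example sentence: "Maria Rossi lives in Rome".
tokens = ["Maria", "Rossi", "vive", "a", "Roma"]
tags = ["B-PER", "I-PER", "O", "O", "B-LOC"]
print(bio_to_spans(tokens, tags))
```

A tagger trained this way can only emit the tag types seen in its training data, which is the limitation on unseen entity types that the abstract contrasts with zero-shot approaches.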
