Charting a Decade of Computational Linguistics in Italy: The CLiC-it Corpus
September 23, 2025
Authors: Chiara Alzetta, Serena Auriemma, Alessandro Bondielli, Luca Dini, Chiara Fazzone, Alessio Miaschi, Martina Miliani, Marta Sartor
cs.AI
Abstract
Over the past decade, Computational Linguistics (CL) and Natural Language
Processing (NLP) have evolved rapidly, especially with the advent of
Transformer-based Large Language Models (LLMs). This shift has transformed
research goals and priorities, from Lexical and Semantic Resources to Language
Modelling and Multimodality. In this study, we track the research trends of the
Italian CL and NLP community through an analysis of the contributions to
CLiC-it, arguably the leading Italian conference in the field. We compile the
proceedings from the first 10 editions of the CLiC-it conference (from 2014 to
2024) into the CLiC-it Corpus and provide a comprehensive analysis of both its
metadata (author provenance, gender, affiliations, and more) and the content
of the papers themselves, which address various topics. Our goal
is to provide the Italian and international research communities with valuable
insights into emerging trends and key developments over time, supporting
informed decisions and future directions in the field.