Named Clinical Entity Recognition Benchmark
October 7, 2024
Authors: Wadood M Abdul, Marco AF Pimentel, Muhammad Umar Salman, Tathagata Raha, Clément Christophe, Praveen K Kanithi, Nasir Hayat, Ronnie Rajan, Shadab Khan
cs.AI
Abstract
This technical report introduces a Named Clinical Entity Recognition
Benchmark for evaluating language models in healthcare, addressing the crucial
natural language processing (NLP) task of extracting structured information
from clinical narratives to support applications like automated coding,
clinical trial cohort identification, and clinical decision support.
The leaderboard provides a standardized platform for assessing diverse
language models, including encoder and decoder architectures, on their ability
to identify and classify clinical entities across multiple medical domains. A
curated collection of openly available clinical datasets is utilized,
encompassing entities such as diseases, symptoms, medications, procedures, and
laboratory measurements. Importantly, these entities are standardized according
to the Observational Medical Outcomes Partnership (OMOP) Common Data Model,
ensuring consistency and interoperability across different healthcare systems
and datasets, and enabling a comprehensive evaluation of model performance.
Model performance is primarily assessed using the F1-score, complemented by
various assessment modes that give a fuller picture of how each model
performs. The report also includes a brief analysis of models evaluated to
date, highlighting observed trends and limitations.
By establishing this benchmarking framework, the leaderboard aims to promote
transparency, facilitate comparative analyses, and drive innovation in clinical
entity recognition tasks, addressing the need for robust evaluation methods in
healthcare NLP.