EHRCon: Dataset for Checking Consistency between Unstructured Notes and Structured Tables in Electronic Health Records

June 24, 2024
Authors: Yeonsu Kwon, Jiho Kim, Gyubok Lee, Seongsu Bae, Daeun Kyung, Wonchul Cha, Tom Pollard, Alistair Johnson, Edward Choi
cs.AI

Abstract

Electronic Health Records (EHRs) are integral for storing comprehensive patient medical records, combining structured data (e.g., medications) with detailed clinical notes (e.g., physician notes). These elements are essential for straightforward data retrieval and provide deep, contextual insights into patient care. However, they often suffer from discrepancies due to unintuitive EHR system designs and human errors, posing serious risks to patient safety. To address this, we developed EHRCon, a new dataset and task specifically designed to ensure data consistency between structured tables and unstructured notes in EHRs. EHRCon was crafted in collaboration with healthcare professionals using the MIMIC-III EHR dataset, and includes manual annotations of 3,943 entities across 105 clinical notes checked against database entries for consistency. EHRCon has two versions, one using the original MIMIC-III schema, and another using the OMOP CDM schema, in order to increase its applicability and generalizability. Furthermore, leveraging the capabilities of large language models, we introduce CheckEHR, a novel framework for verifying the consistency between clinical notes and database tables. CheckEHR utilizes an eight-stage process and shows promising results in both few-shot and zero-shot settings. The code is available at https://github.com/dustn1259/EHRCon.
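
To make the task concrete, the following is a minimal sketch of what checking a note entity against a structured table could look like. It is not the CheckEHR framework and does not use the EHRCon data: the `prescriptions` table, its columns, the sample rows, the `check_entity` helper, and the "consistent"/"inconsistent" labels are all assumptions made up for illustration.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a toy, hypothetical prescriptions table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE prescriptions ("
        "subject_id INTEGER, drug TEXT, dose_val TEXT, startdate TEXT)"
    )
    # Invented rows for illustration only; not taken from MIMIC-III.
    conn.executemany(
        "INSERT INTO prescriptions VALUES (?, ?, ?, ?)",
        [
            (1, "aspirin", "81", "2024-01-02"),
            (1, "metoprolol", "25", "2024-01-03"),
        ],
    )
    return conn


def check_entity(conn, subject_id, drug, dose_val):
    """Label an entity 'consistent' if a matching row exists, else 'inconsistent'."""
    row = conn.execute(
        "SELECT 1 FROM prescriptions "
        "WHERE subject_id = ? AND LOWER(drug) = LOWER(?) AND dose_val = ? "
        "LIMIT 1",
        (subject_id, drug, dose_val),
    ).fetchone()
    return "consistent" if row else "inconsistent"


conn = build_demo_db()

# Entities as they might be extracted from a clinical note (hypothetical).
note_entities = [
    {"drug": "Aspirin", "dose_val": "81"},
    {"drug": "Metoprolol", "dose_val": "50"},  # dose differs from the table
]

labels = [
    check_entity(conn, 1, e["drug"], e["dose_val"]) for e in note_entities
]
print(labels)
```

In this sketch the second entity is flagged because the dose written in the note does not match any row for that patient. A realistic checker would also have to handle entity extraction from free text, time expressions, and schema differences such as MIMIC-III versus OMOP CDM, which the exact-match query here ignores.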
