A Comparative Study on Automatic Coding of Medical Letters with Explainability
July 18, 2024
Authors: Jamie Glen, Lifeng Han, Paul Rayson, Goran Nenadic
cs.AI
Abstract
This study explores the use of Natural Language Processing (NLP) and machine
learning (ML) techniques to automate the coding of medical letters, with
visualised explainability and a lightweight local computer setup. In current
clinical practice, coding is a manual process that involves assigning a code to
each condition, procedure, and medication in a patient's paperwork (e.g.,
SNOMED CT code 56265001 for heart disease). There is preliminary research on
automatic coding in this field using state-of-the-art ML models; however, due
to the complexity and size of these models, real-world deployment has not yet
been achieved. To make automatic coding more feasible in practice, we explore
solutions that run in a local computer setting; in addition, we explore
explainability functions that make the AI models more transparent. We used the
publicly available MIMIC-III database and the HAN/HLAN network models for ICD
code prediction. We also experimented with mapping between the ICD and SNOMED
CT knowledge bases. In our experiments, the models provided useful information
for 97.98% of codes. The results of this investigation can shed some light on
implementing automatic clinical coding in practice, such as in hospital
settings, on the local computers used by clinicians. Project page:
https://github.com/Glenj01/Medical-Coding