
RaTEScore: A Metric for Radiology Report Generation

June 24, 2024
Authors: Weike Zhao, Chaoyi Wu, Xiaoman Zhang, Ya Zhang, Yanfeng Wang, Weidi Xie
cs.AI

Abstract

This paper introduces a novel, entity-aware metric, termed Radiological Report (Text) Evaluation (RaTEScore), to assess the quality of medical reports generated by AI models. RaTEScore emphasizes crucial medical entities such as diagnostic outcomes and anatomical details, is robust to complex medical synonyms, and is sensitive to negation expressions. Technically, we developed a comprehensive medical NER dataset, RaTE-NER, and trained an NER model specifically for this purpose. This model enables the decomposition of complex radiological reports into their constituent medical entities. The metric itself is derived by comparing the similarity of entity embeddings, obtained from a language model, based on their types and relevance to clinical significance. Our evaluations demonstrate that RaTEScore aligns more closely with human preference than existing metrics, as validated on both established public benchmarks and our newly proposed RaTE-Eval benchmark.
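
The abstract describes the metric only at a high level: extract entities, embed them, and compare embeddings with attention to entity type. The sketch below is one possible reading of that description, not the authors' implementation. It assumes the NER model and the embedding model have already run, so entities arrive as hand-written `(text, type, embedding)` tuples. The type names, the `type_weight` table, the best-match rule, and the harmonic-mean combination are all illustrative assumptions that the abstract does not specify.

```python
import math
from typing import Dict, List, Tuple

# (entity text, entity type, embedding vector). Hypothetical toy inputs;
# a real pipeline would obtain these from an NER model and a language model.
Entity = Tuple[str, str, List[float]]


def cosine(u: List[float], v: List[float]) -> float:
    """Cosine similarity of two vectors; 0.0 if either has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def directed_score(src: List[Entity], tgt: List[Entity],
                   type_weight: Dict[str, float]) -> float:
    """Weighted mean, over src entities, of the best cosine match in tgt."""
    if not src or not tgt:
        return 0.0
    weighted_sum = 0.0
    weight_total = 0.0
    for _, src_type, src_vec in src:
        best = max(cosine(src_vec, tgt_vec) for _, _, tgt_vec in tgt)
        weight = type_weight.get(src_type, 1.0)  # assumed per-type importance
        weighted_sum += weight * best
        weight_total += weight
    return weighted_sum / weight_total if weight_total else 0.0


def entity_similarity_score(reference: List[Entity], candidate: List[Entity],
                            type_weight: Dict[str, float]) -> float:
    """Harmonic mean of the reference-to-candidate and candidate-to-reference scores."""
    recall_like = directed_score(reference, candidate, type_weight)
    precision_like = directed_score(candidate, reference, type_weight)
    if precision_like + recall_like <= 0:
        return 0.0
    return 2 * precision_like * recall_like / (precision_like + recall_like)


if __name__ == "__main__":
    # Toy 2-d embeddings and weights, chosen only to make the arithmetic easy to follow.
    weights = {"abnormality": 2.0, "anatomy": 1.0}
    reference = [("pleural effusion", "abnormality", [1.0, 0.0]),
                 ("left lung", "anatomy", [0.0, 1.0])]
    candidate = [("left lung", "anatomy", [0.0, 1.0])]
    print(entity_similarity_score(reference, candidate, weights))
```

Because similarity is computed on embeddings rather than on surface strings, synonymous entities can still match, which is the property the abstract highlights. Negation handling is not modeled in this sketch.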
