Kvasir-VQA: A Text-Image Pair GI Tract Dataset

September 2, 2024
Authors: Sushant Gautam, Andrea Storås, Cise Midoglu, Steven A. Hicks, Vajira Thambawita, Pål Halvorsen, Michael A. Riegler
cs.AI

Abstract

We introduce Kvasir-VQA, an extended dataset derived from the HyperKvasir and Kvasir-Instrument datasets, augmented with question-and-answer annotations to facilitate advanced machine learning tasks in Gastrointestinal (GI) diagnostics. This dataset comprises 6,500 annotated images spanning various GI tract conditions and surgical instruments, and it supports multiple question types including yes/no, choice, location, and numerical count. The dataset is intended for applications such as image captioning, Visual Question Answering (VQA), text-based generation of synthetic medical images, object detection, and classification. Our experiments demonstrate the dataset's effectiveness in training models for three selected tasks, showcasing significant applications in medical image analysis and diagnostics. We also present evaluation metrics for each task, highlighting the usability and versatility of our dataset. The dataset and supporting artifacts are available at https://datasets.simula.no/kvasir-vqa.
