Frontiers in Intelligent Colonoscopy

Abstract

Colonoscopy is currently one of the most sensitive screening methods for colorectal cancer. This study investigates the frontiers of intelligent colonoscopy techniques and their prospective implications for multimodal medical applications. To this end, we first assess the current data-centric and model-centric landscapes through four tasks for colonoscopic scene perception: classification, detection, segmentation, and vision-language understanding. This assessment lets us identify domain-specific challenges and reveals that multimodal research in colonoscopy remains open for further exploration. To embrace the coming multimodal era, we establish three foundational initiatives: a large-scale multimodal instruction tuning dataset, ColonINST; a multimodal language model designed for colonoscopy, ColonGPT; and a multimodal benchmark. To facilitate ongoing monitoring of this rapidly evolving field, we provide a public website for the latest updates: https://github.com/ai4colonoscopy/IntelliScope.
