Unchecked and Overlooked: Addressing the Checkbox Blind Spot in Large Language Models with CheckboxQA

Abstract

Checkboxes are critical in real-world document processing where the presence or absence of ticks directly informs data extraction and decision-making processes. Yet, despite the strong performance of Large Vision and Language Models across a wide range of tasks, they struggle with interpreting checkable content. This challenge becomes particularly pressing in industries where a single overlooked checkbox may lead to costly regulatory or contractual oversights. To address this gap, we introduce the CheckboxQA dataset, a targeted resource designed to evaluate and improve model performance on checkbox-related tasks. It reveals the limitations of current models and serves as a valuable tool for advancing document comprehension systems, with significant implications for applications in sectors such as legal tech and finance. The dataset is publicly available at: https://github.com/Snowflake-Labs/CheckboxQA