Evaluating Multimodal Generative AI with Korean Educational Standards
February 21, 2025
Authors: Sanghee Park, Geewook Kim
cs.AI
Abstract
This paper presents the Korean National Educational Test Benchmark (KoNET), a
new benchmark designed to evaluate Multimodal Generative AI Systems using
Korean national educational tests. KoNET comprises four exams: the Korean
Elementary General Educational Development Test (KoEGED), Middle (KoMGED), High
(KoHGED), and College Scholastic Ability Test (KoCSAT). These exams are
renowned for their rigorous standards and diverse questions, facilitating a
comprehensive analysis of AI performance across different educational levels.
By focusing on Korean, KoNET provides insights into model performance in
less-explored languages. We assess a range of models (open-source,
open-access, and closed APIs) by examining difficulties, subject diversity,
and human error rates. The code and dataset builder will be made fully
open-sourced at https://github.com/naver-ai/KoNET.