RoMath: A Mathematical Reasoning Benchmark in Romanian

September 17, 2024
Authors: Adrian Cosma, Ana-Maria Bucur, Emilian Radoi
cs.AI

Abstract

Mathematics has long been conveyed through natural language, primarily for human understanding. With the rise of mechanized mathematics and proof assistants, there is a growing need to understand informal mathematical text, yet most existing benchmarks focus solely on English, overlooking other languages. This paper introduces RoMath, a Romanian mathematical reasoning benchmark suite comprising three datasets: RoMath-Baccalaureate, RoMath-Competitions and RoMath-Synthetic, which cover a range of mathematical domains and difficulty levels, aiming to improve non-English language models and promote multilingual AI development. By focusing on Romanian, a low-resource language with unique linguistic features, RoMath addresses the limitations of Anglo-centric models and emphasizes the need for dedicated resources beyond simple automatic translation. We benchmark several open-weight language models, highlighting the importance of creating resources for underrepresented languages. We make the code and dataset available.
