RoMath: A Mathematical Reasoning Benchmark in Romanian
September 17, 2024
Authors: Adrian Cosma, Ana-Maria Bucur, Emilian Radoi
cs.AI
Abstract
Mathematics has long been conveyed through natural language, primarily for
human understanding. With the rise of mechanized mathematics and proof
assistants, there is a growing need to understand informal mathematical text,
yet most existing benchmarks focus solely on English, overlooking other
languages. This paper introduces RoMath, a Romanian mathematical reasoning
benchmark suite comprising three datasets: RoMath-Baccalaureate,
RoMath-Competitions and RoMath-Synthetic, which cover a range of mathematical
domains and difficulty levels, aiming to improve non-English language models
and promote multilingual AI development. By focusing on Romanian, a
low-resource language with unique linguistic features, RoMath addresses the
limitations of Anglo-centric models and emphasizes the need for dedicated
resources beyond simple automatic translation. We benchmark several open-weight
language models, highlighting the importance of creating resources for
underrepresented languages. We make the code and dataset available.