Tiny Aya: Bridging Scale and Multilingual Depth
March 12, 2026
Authors: Alejandro R. Salamanca, Diana Abagyan, Daniel D'souza, Ammar Khairi, David Mora, Saurabh Dash, Viraat Aryabumi, Sara Rajaee, Mehrnaz Mofakhami, Ananya Sahu, Thomas Euyang, Brittawnya Prince, Madeline Smith, Hangyu Lin, Acyr Locatelli, Sara Hooker, Tom Kocmi, Aidan Gomez, Ivan Zhang, Phil Blunsom, Nick Frosst, Joelle Pineau, Beyza Ermis, Ahmet Üstün, Julia Kreutzer, Marzieh Fadaee
cs.AI
Abstract
Tiny Aya redefines what a small multilingual language model can achieve. Trained on 70 languages and refined through region-aware post-training, it delivers state-of-the-art translation quality, strong multilingual understanding, and high-quality target-language generation, all with just 3.35B parameters. The release includes a pretrained foundation model, a globally balanced instruction-tuned variant, and three region-specialized models targeting languages from Africa, South Asia, Europe, Asia-Pacific, and West Asia. This report details the training strategy, data composition, and comprehensive evaluation framework behind Tiny Aya, and presents an alternative scaling path for multilingual AI: one centered on efficiency, balanced performance across languages, and practical deployment.
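
For readers who want to experiment with a checkpoint from the release, the sketch below shows the standard Hugging Face transformers loading pattern for an instruction-tuned causal LM of this size. The model identifier is a hypothetical placeholder, since the abstract does not name the published repo IDs; consult the release page for the actual ones.

```python
# A minimal sketch of loading and prompting an instruction-tuned checkpoint
# with Hugging Face transformers. The model ID below is a HYPOTHETICAL
# placeholder, not taken from the abstract.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "CohereLabs/tiny-aya-instruct"  # hypothetical repo ID

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # 3.35B parameters fit on a single consumer GPU in fp16
    device_map="auto",
)

# Chat-style prompting via the tokenizer's chat template, assuming one is shipped
# with the checkpoint (standard for instruction-tuned releases).
messages = [{"role": "user", "content": "Translate to Swahili: Good morning!"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=128)
# Decode only the newly generated tokens, skipping the echoed prompt.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```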