Tiny Aya: Bridging Scale and Multilingual Depth
March 12, 2026
Authors: Alejandro R. Salamanca, Diana Abagyan, Daniel D'souza, Ammar Khairi, David Mora, Saurabh Dash, Viraat Aryabumi, Sara Rajaee, Mehrnaz Mofakhami, Ananya Sahu, Thomas Euyang, Brittawnya Prince, Madeline Smith, Hangyu Lin, Acyr Locatelli, Sara Hooker, Tom Kocmi, Aidan Gomez, Ivan Zhang, Phil Blunsom, Nick Frosst, Joelle Pineau, Beyza Ermis, Ahmet Üstün, Julia Kreutzer, Marzieh Fadaee
cs.AI
Abstract
Tiny Aya redefines what a small multilingual language model can achieve. Trained on 70 languages and refined through region-aware post-training, it delivers state-of-the-art translation quality, strong multilingual understanding, and high-quality target-language generation, all with just 3.35B parameters. The release includes a pretrained foundation model, a globally balanced instruction-tuned variant, and three region-specialized models targeting languages from Africa, South Asia, Europe, Asia-Pacific, and West Asia. This report details the training strategy, data composition, and comprehensive evaluation framework behind Tiny Aya, and presents an alternative scaling path for multilingual AI: one centered on efficiency, balanced performance across languages, and practical deployment.
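
For readers who want to experiment with a checkpoint from the release, the sketch below shows the standard Hugging Face transformers loading pattern for an instruction-tuned causal LM of this size. The model identifier is a hypothetical placeholder, since the abstract does not name the published repo IDs; consult the release page for the actual ones.

```python
# A minimal sketch of loading and prompting an instruction-tuned checkpoint
# with Hugging Face transformers. The model ID below is a HYPOTHETICAL
# placeholder, not taken from the abstract.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "CohereLabs/tiny-aya-instruct"  # hypothetical repo ID

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # 3.35B parameters fit on a single consumer GPU in fp16
    device_map="auto",
)

# Chat-style prompting via the tokenizer's chat template, assuming one is shipped
# with the checkpoint (standard for instruction-tuned releases).
messages = [{"role": "user", "content": "Translate to Swahili: Good morning!"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=128)
# Decode only the newly generated tokens, skipping the echoed prompt.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```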