

Breaking Boundaries: Investigating the Effects of Model Editing on Cross-linguistic Performance

June 17, 2024
作者: Somnath Banerjee, Avik Halder, Rajarshi Mandal, Sayan Layek, Ian Soboroff, Rima Hazra, Animesh Mukherjee
cs.AI

Abstract

The integration of pretrained language models (PLMs) like BERT and GPT has revolutionized NLP, particularly for English, but it has also created linguistic imbalances. This paper strategically identifies the need for linguistic equity by examining several knowledge editing techniques in multilingual contexts. We evaluate the performance of models such as Mistral, TowerInstruct, OpenHathi, Tamil-Llama, and Kan-Llama across languages including English, German, French, Italian, Spanish, Hindi, Tamil, and Kannada. Our research identifies significant discrepancies in normal and merged models concerning cross-lingual consistency. We employ strategies like 'each language for itself' (ELFI) and 'each language for others' (ELFO) to stress-test these models. Our findings demonstrate the potential for LLMs to overcome linguistic barriers, laying the groundwork for future research in achieving linguistic inclusivity in AI technologies.

