On the Acquisition of Shared Grammatical Representations in Bilingual Language Models

March 5, 2025
Authors: Catherine Arnett, Tyler A. Chang, James A. Michaelov, Benjamin K. Bergen
cs.AI

Abstract

While crosslingual transfer is crucial to contemporary language models' multilingual capabilities, how it occurs is not well understood. In this paper, we ask what happens to a monolingual language model when it begins to be trained on a second language. Specifically, we train small bilingual models for which we control the amount of data for each language and the order of language exposure. To find evidence of shared multilingual representations, we turn to structural priming, a method used to study grammatical representations in humans. We first replicate previous crosslingual structural priming results and find that after controlling for training data quantity and language exposure, there are asymmetrical effects across language pairs and directions. We argue that this asymmetry may shape hypotheses about human structural priming effects. We also find that structural priming effects are less robust for less similar language pairs, highlighting potential limitations of crosslingual transfer learning and shared representations for typologically diverse languages.
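To illustrate the structural priming logic the abstract refers to, the sketch below measures how much a congruent prime sentence raises the log-probability of a target sentence relative to an incongruent prime, using an off-the-shelf causal language model. This is a minimal, hypothetical example: the model name, the prime/target sentence pair, and the scoring function are illustrative assumptions, not the authors' actual bilingual models, stimuli, or code.

```python
# Hedged sketch of a crosslingual structural priming measurement with a causal LM.
# Assumptions: "gpt2" as a stand-in model and invented English/German sentences;
# the paper instead trains its own small bilingual models on controlled data.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # placeholder model, not from the paper
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
model.eval()

def target_logprob(prime: str, target: str) -> float:
    """Sum of log-probabilities of the target tokens, conditioned on the prime."""
    prime_ids = tokenizer(prime, return_tensors="pt").input_ids
    target_ids = tokenizer(" " + target, return_tensors="pt").input_ids
    input_ids = torch.cat([prime_ids, target_ids], dim=1)
    with torch.no_grad():
        logits = model(input_ids).logits
    log_probs = torch.log_softmax(logits, dim=-1)
    offset = prime_ids.shape[1]
    total = 0.0
    for i in range(target_ids.shape[1]):
        token_id = input_ids[0, offset + i]
        # logits at position (offset + i - 1) predict the token at position (offset + i)
        total += log_probs[0, offset + i - 1, token_id].item()
    return total

# Priming effect: does a passive prime in one language make a passive target in
# another language more probable than an active prime does?
congruent = target_logprob("The ball was kicked by the boy.",
                           "Das Buch wurde von dem Mädchen gelesen.")
incongruent = target_logprob("The boy kicked the ball.",
                             "Das Buch wurde von dem Mädchen gelesen.")
print("priming effect (log-prob difference):", congruent - incongruent)
```

A positive difference would indicate a priming effect in this direction; comparing such differences across language pairs and prime-target directions is the kind of asymmetry the abstract describes.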