Generating novel experimental hypotheses from language models: A case study on cross-dative generalization

Abstract

Neural network language models (LMs) have been shown to successfully capture complex linguistic knowledge. However, their utility for understanding language acquisition is still debated. We contribute to this debate by presenting a case study in which we use LMs as simulated learners to derive novel experimental hypotheses to be tested with humans. We apply this paradigm to study cross-dative generalization (CDG): productive generalization of novel verbs across dative constructions (she pilked me the ball/she pilked the ball to me), the acquisition of which is known to involve a large space of contextual features, using LMs trained on child-directed speech. We specifically ask: "what properties of the training exposure facilitate a novel verb's generalization to the (unmodeled) alternate construction?" To answer this, we systematically vary the exposure context in which a novel dative verb occurs in terms of the properties of the theme and recipient, and then analyze the LMs' usage of the novel verb in the unmodeled dative construction. We find that LMs replicate known patterns of children's CDG, a precondition for exploring novel hypotheses. Subsequent simulations reveal a nuanced role of the features of the novel verbs' exposure context in the LMs' CDG. We find CDG to be facilitated when the first postverbal argument of the exposure context is pronominal, definite, short, and conforms to the prototypical animacy expectations of the exposure dative. These patterns are characteristic of harmonic alignment in datives, where the argument with features ranking higher on the discourse prominence scale tends to precede the other. This gives rise to a novel hypothesis that CDG is facilitated insofar as the features of the exposure context, in particular its first postverbal argument, are harmonically aligned. We conclude by proposing future experiments that can test this hypothesis in children.

