Generating novel experimental hypotheses from language models: A case study on cross-dative generalization

Abstract

Neural network language models (LMs) have been shown to successfully capture complex linguistic knowledge. However, their utility for understanding language acquisition is still debated. We contribute to this debate by presenting a case study where we use LMs as simulated learners to derive novel experimental hypotheses to be tested with humans. We apply this paradigm to study cross-dative generalization (CDG): productive generalization of novel verbs across dative constructions (she pilked me the ball/she pilked the ball to me) -- acquisition of which is known to involve a large space of contextual features -- using LMs trained on child-directed speech. We specifically ask: "what properties of the training exposure facilitate a novel verb's generalization to the (unmodeled) alternate construction?" To answer this, we systematically vary the exposure context in which a novel dative verb occurs in terms of the properties of the theme and recipient, and then analyze the LMs' usage of the novel verb in the unmodeled dative construction. We find LMs to replicate known patterns of children's CDG, as a precondition to exploring novel hypotheses. Subsequent simulations reveal a nuanced role of the features of the novel verbs' exposure context on the LMs' CDG. We find CDG to be facilitated when the first postverbal argument of the exposure context is pronominal, definite, short, and conforms to the prototypical animacy expectations of the exposure dative. These patterns are characteristic of harmonic alignment in datives, where the argument with features ranking higher on the discourse prominence scale tends to precede the other. This gives rise to a novel hypothesis that CDG is facilitated insofar as the features of the exposure context -- in particular, its first postverbal argument -- are harmonically aligned. We conclude by proposing future experiments that can test this hypothesis in children.
