The Pragmatic Mind of Machines: Tracing the Emergence of Pragmatic Competence in Large Language Models

Abstract

Current large language models (LLMs) have demonstrated emerging capabilities in social intelligence tasks, including implicature resolution (Sravanthi et al., 2024) and theory-of-mind reasoning (Shapira et al., 2024), both of which require substantial pragmatic understanding. However, how LLMs acquire this competence over the course of training remains poorly understood. In this work, we introduce ALTPRAG, a dataset grounded in the pragmatic concept of alternatives and designed to evaluate whether LLMs at different training stages can accurately infer nuanced speaker intentions. Each instance pairs two contextually appropriate but pragmatically distinct continuations, enabling fine-grained assessment of both pragmatic interpretation and contrastive reasoning. We systematically evaluate 22 LLMs across the key training stages of pre-training, supervised fine-tuning (SFT), and preference optimization to examine how pragmatic competence develops. Our results show that even base models exhibit notable sensitivity to pragmatic cues, and that this sensitivity improves consistently as model and data scale increase. SFT and reinforcement learning from human feedback (RLHF) contribute further gains, particularly in cognitive-pragmatic reasoning. These findings highlight pragmatic competence as an emergent and compositional property of LLM training and offer new insights for aligning models with human communicative norms.
