The Generative AI Paradox: "What It Can Create, It May Not Understand"

Abstract

The recent wave of generative AI has sparked unprecedented global attention, with both excitement and concern over potentially superhuman levels of artificial intelligence: models now take only seconds to produce outputs that would challenge or exceed the capabilities even of expert humans. At the same time, models still make basic errors in understanding that would not be expected even of non-expert humans. This presents an apparent paradox: how do we reconcile seemingly superhuman capabilities with the persistence of errors that few humans would make? In this work, we posit that this tension reflects a divergence in the configuration of intelligence in today's generative models relative to intelligence in humans. Specifically, we propose and test the Generative AI Paradox hypothesis: generative models, having been trained directly to reproduce expert-like outputs, acquire generative capabilities that are not contingent upon, and can therefore exceed, their ability to understand those same types of outputs. This contrasts with humans, for whom basic understanding almost always precedes the ability to generate expert-level outputs. We test this hypothesis through controlled experiments analyzing generation vs. understanding in generative models, across both language and image modalities. Our results show that although models can outperform humans in generation, they consistently fall short of human capabilities in measures of understanding, show weaker correlation between generation and understanding performance, and are more brittle to adversarial inputs. Our findings support the hypothesis that models' generative capability may not be contingent upon understanding capability, and call for caution in interpreting artificial intelligence by analogy to human intelligence.
