The Generative AI Paradox: "What It Can Create, It May Not Understand"
October 31, 2023
Authors: Peter West, Ximing Lu, Nouha Dziri, Faeze Brahman, Linjie Li, Jena D. Hwang, Liwei Jiang, Jillian Fisher, Abhilasha Ravichander, Khyathi Chandu, Benjamin Newman, Pang Wei Koh, Allyson Ettinger, Yejin Choi
cs.AI
Abstract
The recent wave of generative AI has sparked unprecedented global attention,
with both excitement and concern over potentially superhuman levels of
artificial intelligence: models now take only seconds to produce outputs that
would challenge or exceed the capabilities even of expert humans. At the same
time, models still show basic errors in understanding that would not be
expected even in non-expert humans. This presents us with an apparent paradox:
how do we reconcile seemingly superhuman capabilities with the persistence of
errors that few humans would make? In this work, we posit that this tension
reflects a divergence in the configuration of intelligence in today's
generative models relative to intelligence in humans. Specifically, we propose
and test the Generative AI Paradox hypothesis: generative models, having been
trained directly to reproduce expert-like outputs, acquire generative
capabilities that are not contingent upon -- and can therefore exceed -- their
ability to understand those same types of outputs. This contrasts with humans,
for whom basic understanding almost always precedes the ability to generate
expert-level outputs. We test this hypothesis through controlled experiments
analyzing generation vs. understanding in generative models, across both
language and image modalities. Our results show that although models can
outperform humans in generation, they consistently fall short of human
capabilities in measures of understanding; they also exhibit weaker
correlation between generation and understanding performance and greater
brittleness to adversarial inputs. Our findings support the hypothesis that models' generative
capability may not be contingent upon understanding capability, and call for
caution in interpreting artificial intelligence by analogy to human
intelligence.
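The core comparison the abstract describes, scoring the same model on a generative variant and a discriminative (understanding) variant of a task, can be sketched as a toy example. All function names and data here are illustrative placeholders, not the paper's actual evaluation code:

```python
# Toy sketch of the Generative AI Paradox evaluation logic:
# compare a model's generative score against its discriminative
# (understanding) score on paired variants of the same task.

def generative_score(outputs, references):
    """Fraction of generated outputs that match a reference (placeholder judge)."""
    return sum(o == r for o, r in zip(outputs, references)) / len(references)

def discriminative_score(choices_made, correct_choices):
    """Accuracy when the model must select the correct answer among candidates."""
    return sum(c == k for c, k in zip(choices_made, correct_choices)) / len(correct_choices)

# Hypothetical data for a model that "creates" well but "understands" poorly.
gen = ["a", "b", "c", "d"]
refs = ["a", "b", "c", "d"]   # perfect generation
picked = [0, 2, 1, 3]
keys = [0, 1, 1, 0]           # weak answer selection

g = generative_score(gen, refs)
d = discriminative_score(picked, keys)
paradox = g > d  # generation exceeding understanding is the pattern the paper reports
print(g, d, paradox)  # → 1.0 0.5 True
```

The point of the sketch is only the shape of the comparison: the paper's finding is that these two scores can diverge sharply for generative models, whereas for humans understanding typically bounds generation.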