The Generative AI Paradox: "What It Can Create, It May Not Understand"
October 31, 2023
Authors: Peter West, Ximing Lu, Nouha Dziri, Faeze Brahman, Linjie Li, Jena D. Hwang, Liwei Jiang, Jillian Fisher, Abhilasha Ravichander, Khyathi Chandu, Benjamin Newman, Pang Wei Koh, Allyson Ettinger, Yejin Choi
cs.AI
Abstract
The recent wave of generative AI has sparked unprecedented global attention,
with both excitement and concern over potentially superhuman levels of
artificial intelligence: models now take only seconds to produce outputs that
would challenge or exceed the capabilities even of expert humans. At the same
time, models still show basic errors in understanding that would not be
expected even in non-expert humans. This presents us with an apparent paradox:
how do we reconcile seemingly superhuman capabilities with the persistence of
errors that few humans would make? In this work, we posit that this tension
reflects a divergence in the configuration of intelligence in today's
generative models relative to intelligence in humans. Specifically, we propose
and test the Generative AI Paradox hypothesis: generative models, having been
trained directly to reproduce expert-like outputs, acquire generative
capabilities that are not contingent upon -- and can therefore exceed -- their
ability to understand those same types of outputs. This contrasts with humans,
for whom basic understanding almost always precedes the ability to generate
expert-level outputs. We test this hypothesis through controlled experiments
analyzing generation vs. understanding in generative models, across both
language and image modalities. Our results show that although models can
outperform humans in generation, they consistently fall short of human
capabilities in measures of understanding; they also show a weaker
correlation between their generation and understanding performance, and
greater brittleness to adversarial inputs. Our findings support the
hypothesis that models' generative
capability may not be contingent upon understanding capability, and call for
caution in interpreting artificial intelligence by analogy to human
intelligence.
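
The experimental contrast the abstract describes can be made concrete with a small harness that scores the same model on paired generation and discrimination versions of each task. Below is a minimal Python sketch, not the paper's actual evaluation code: `model.generate`, `model.choose`, and `judge.score` are hypothetical placeholder interfaces, and the scoring is deliberately simplified to illustrate the three quantities the results refer to (generation performance, understanding performance, and the correlation between them).

```python
# Sketch of a generation-vs-understanding comparison in the spirit of the
# paper's experiments. `model` and `judge` are hypothetical placeholders;
# the actual prompts, raters, and task suites are not reproduced here.

from dataclasses import dataclass
from statistics import correlation  # Pearson's r; requires Python 3.10+


@dataclass
class Task:
    prompt: str            # generation instruction, e.g. "Write a headline for ..."
    candidates: list[str]  # answer options for the discrimination variant
    gold_index: int        # index of the human-preferred candidate


def generation_score(model, task, judge) -> float:
    """Generate an output and have a judge (e.g. human raters) score it in [0, 1]."""
    output = model.generate(task.prompt)
    return judge.score(task.prompt, output)


def understanding_score(model, task) -> float:
    """Discrimination-style probe: can the model pick the best candidate
    among options for the same kind of task it generates for?"""
    picked = model.choose(task.prompt, task.candidates)
    return float(picked == task.gold_index)


def generation_understanding_gap(model, tasks, judge):
    gen = [generation_score(model, t, judge) for t in tasks]
    und = [understanding_score(model, t) for t in tasks]
    # The paradox predicts a high generation mean, a lower understanding
    # mean, and only a weak correlation between the two across tasks.
    return sum(gen) / len(gen), sum(und) / len(und), correlation(gen, und)
```

Under the paper's hypothesis, a human-like profile would show understanding at least keeping pace with generation and a strong positive correlation across tasks; the reported findings correspond to the opposite pattern in this sketch's three return values.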