The Generative AI Paradox: "What It Can Create, It May Not Understand"

Abstract

The recent wave of generative AI has sparked unprecedented global attention, with both excitement and concern over potentially superhuman levels of artificial intelligence: models now take only seconds to produce outputs that would challenge or exceed the capabilities of even expert humans. At the same time, models still make basic errors of understanding that would not be expected even from non-expert humans. This presents an apparent paradox: how do we reconcile seemingly superhuman capabilities with the persistence of errors that few humans would make? In this work, we posit that this tension reflects a divergence between the configuration of intelligence in today's generative models and the configuration of intelligence in humans. Specifically, we propose and test the Generative AI Paradox hypothesis: generative models, having been trained directly to reproduce expert-like outputs, acquire generative capabilities that are not contingent upon, and can therefore exceed, their ability to understand those same types of outputs. This contrasts with humans, for whom basic understanding almost always precedes the ability to generate expert-level outputs. We test this hypothesis through controlled experiments analyzing generation versus understanding in generative models, across both language and image modalities. Our results show that although models can outperform humans in generation, they consistently fall short of human capabilities in measures of understanding, show weaker correlation between generation and understanding performance, and are more brittle to adversarial inputs. Our findings support the hypothesis that models' generative capability may not be contingent upon understanding capability, and call for caution in interpreting artificial intelligence by analogy to human intelligence.
