Evaluating Intelligence via Trial and Error
February 26, 2025
Authors: Jingtao Zhan, Jiahao Zhao, Jiayu Li, Yiqun Liu, Bo Zhang, Qingyao Ai, Jiaxin Mao, Hongning Wang, Min Zhang, Shaoping Ma
cs.AI
Abstract
Intelligence is a crucial trait for species to find solutions within a
limited number of trial-and-error attempts. Building on this idea, we introduce
Survival Game as a framework to evaluate intelligence based on the number of
failed attempts in a trial-and-error process. Fewer failures indicate higher
intelligence. When the expectation and variance of failure counts are both
finite, it signals the ability to consistently find solutions to new
challenges, which we define as the Autonomous Level of intelligence. Using
Survival Game, we comprehensively evaluate existing AI systems. Our results
show that while AI systems achieve the Autonomous Level in simple tasks, they
are still far from it in more complex tasks, such as vision, search,
recommendation, and language. While scaling current AI technologies might help,
this would come at an astronomical cost. Projections suggest that achieving the
Autonomous Level for general tasks would require 10^{26} parameters. To put
this into perspective, loading such a massive model would require so many H100 GPUs
that their total value would be 10^{7} times Apple Inc.'s market value.
Even with Moore's Law, supporting such a parameter scale would take 70 years.
This staggering cost highlights the complexity of human tasks and the
inadequacies of current AI technologies. To further investigate this
phenomenon, we conduct a theoretical analysis of Survival Game and its
experimental results. Our findings suggest that human tasks possess a
criticality property. As a result, reaching the Autonomous Level requires a deep
understanding of the task's underlying mechanisms. Current AI systems, however,
do not fully grasp these mechanisms and instead rely on superficial mimicry,
making it difficult for them to reach the Autonomous Level. We believe Survival
Game can not only guide the future development of AI but also offer profound
insights into human intelligence.
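To make the failure-count idea concrete, here is a minimal sketch in Python. It is not the authors' implementation. It assumes that a system tries candidate answers in descending order of its own scores, so the number of failures is the number of candidates ranked above the correct one. The function names `failure_count` and `summarize` and the toy data are hypothetical, introduced only for illustration.

```python
from statistics import mean, pvariance


def failure_count(scores, correct_index):
    """Count wrong attempts before the correct candidate is tried.

    Assumption (not from the abstract): candidates are tried in
    descending score order, and ties are resolved in favor of the
    correct candidate.
    """
    target = scores[correct_index]
    # Every candidate scored strictly higher is tried first and fails.
    return sum(1 for s in scores if s > target)


def summarize(failure_counts):
    """Empirical mean and population variance of the failure counts."""
    return {
        "mean": mean(failure_counts),
        "variance": pvariance(failure_counts),
    }


# Hypothetical toy trials: (scores over candidates, index of the correct one).
trials = [
    ([0.7, 0.2, 0.1], 0),  # correct answer ranked first
    ([0.5, 0.3, 0.2], 1),  # correct answer ranked second
    ([0.1, 0.6, 0.3], 0),  # correct answer ranked last
]

counts = [failure_count(scores, idx) for scores, idx in trials]
stats = summarize(counts)
print(counts, stats)
```

A sample mean and variance are always finite, so statistics like these cannot by themselves show that the underlying expectation and variance are finite. In practice one would look at how the tail of the failure-count distribution behaves as the number of trials grows.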