We Can't Understand AI Using our Existing Vocabulary

February 11, 2025
Authors: John Hewitt, Robert Geirhos, Been Kim
cs.AI

Abstract

This position paper argues that, in order to understand AI, we cannot rely on our existing vocabulary of human words. Instead, we should strive to develop neologisms: new words that represent precise human concepts that we want to teach machines, or machine concepts that we need to learn. We start from the premise that humans and machines have differing concepts. This means interpretability can be framed as a communication problem: humans must be able to reference and control machine concepts, and communicate human concepts to machines. Creating a shared human-machine language through developing neologisms, we believe, could solve this communication problem. Successful neologisms achieve a useful amount of abstraction: not too detailed, so they're reusable in many contexts, and not too high-level, so they convey precise information. As a proof of concept, we demonstrate how a "length neologism" enables controlling LLM response length, while a "diversity neologism" allows sampling more variable responses. Taken together, we argue that we cannot understand AI using our existing vocabulary, and expanding it through neologisms creates opportunities for both controlling and understanding machines better.
