Revisiting the Platonic Representation Hypothesis: An Aristotelian View
February 16, 2026
Authors: Fabian Gröger, Shuo Wen, Maria Brbić
cs.AI
Abstract
The Platonic Representation Hypothesis suggests that representations from neural networks are converging to a common statistical model of reality. We show that the existing metrics used to measure representational similarity are confounded by network scale: increasing model depth or width can systematically inflate representational similarity scores. To correct these effects, we introduce a permutation-based null-calibration framework that transforms any representational similarity metric into a calibrated score with statistical guarantees. We revisit the Platonic Representation Hypothesis with our calibration framework, which reveals a nuanced picture: the apparent convergence reported by global spectral measures largely disappears after calibration, while local neighborhood similarity, but not local distances, retains significant agreement across different modalities. Based on these findings, we propose the Aristotelian Representation Hypothesis: representations in neural networks are converging to shared local neighborhood relationships.
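The calibration idea can be illustrated with a minimal sketch. It assumes that row i of both representations describes the same sample, uses mutual k-nearest-neighbor overlap as the local neighborhood metric, and builds the null distribution by shuffling the rows of one representation to break the sample correspondence. The function names (`knn_sets`, `mutual_knn`, `calibrate`), the excess-over-null score, and the toy data are illustrative assumptions; the abstract does not specify the paper's exact calibration statistic.

```python
import math
import random


def knn_sets(points, k):
    """For each point, return the index set of its k nearest neighbors (Euclidean)."""
    n = len(points)
    sets = []
    for i in range(n):
        order = sorted((j for j in range(n) if j != i),
                       key=lambda j: math.dist(points[i], points[j]))
        sets.append(set(order[:k]))
    return sets


def mutual_knn(x, y, k=5):
    """Mean fraction of shared k-nearest neighbors between two representations
    of the same samples (row i of x and row i of y are the same sample)."""
    nx, ny = knn_sets(x, k), knn_sets(y, k)
    return sum(len(a & b) for a, b in zip(nx, ny)) / (k * len(x))


def calibrate(metric, x, y, n_perm=199, seed=0):
    """Compare the observed score with a permutation null (assumed scheme).

    Shuffling the rows of y destroys the sample correspondence while keeping
    the geometry of each representation, so the null absorbs any baseline
    score the metric produces regardless of true alignment.
    """
    rng = random.Random(seed)
    observed = metric(x, y)
    null = []
    for _ in range(n_perm):
        perm = list(range(len(y)))
        rng.shuffle(perm)
        null.append(metric(x, [y[j] for j in perm]))
    null_mean = sum(null) / n_perm
    # Standard permutation p-value with the +1 correction.
    p_value = (1 + sum(s >= observed for s in null)) / (n_perm + 1)
    return {"observed": observed, "null_mean": null_mean,
            "excess": observed - null_mean, "p_value": p_value}


# Toy data (hypothetical): 40 samples in 3 dimensions.
rng = random.Random(1)
latent = [[rng.gauss(0, 1) for _ in range(3)] for _ in range(40)]
# Aligned view: rescaled copy with small noise, so neighborhoods are preserved.
view_a = [[2.0 * v + rng.gauss(0, 0.01) for v in row] for row in latent]
# Independent view: fresh samples unrelated to `latent`.
view_b = [[rng.gauss(0, 1) for _ in range(3)] for _ in range(40)]

aligned = calibrate(mutual_knn, latent, view_a)
independent = calibrate(mutual_knn, latent, view_b)
print("aligned:", aligned)
print("independent:", independent)
```

In this sketch the raw score of the independent pair is not zero, since any two point sets share some neighbors by chance; subtracting the null mean and reporting a permutation p-value is one way to keep such a baseline from being read as convergence. Any other similarity function with the same `(x, y)` signature could be passed as `metric`.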