Does Spatial Cognition Emerge in Frontier Models?
October 9, 2024
Authors: Santhosh Kumar Ramakrishnan, Erik Wijmans, Philipp Kraehenbuehl, Vladlen Koltun
cs.AI
Abstract
Not yet. We present SPACE, a benchmark that systematically evaluates spatial
cognition in frontier models. Our benchmark builds on decades of research in
cognitive science. It evaluates large-scale mapping abilities that are brought
to bear when an organism traverses physical environments, smaller-scale
reasoning about object shapes and layouts, and cognitive infrastructure such as
spatial attention and memory. For many tasks, we instantiate parallel
presentations via text and images, allowing us to benchmark both large language
models and large multimodal models. Results suggest that contemporary frontier
models fall short of the spatial intelligence of animals, performing near
chance level on a number of classic tests of animal cognition.