Future Is Unevenly Distributed: Forecasting Ability of LLMs Depends on What We're Asking

November 23, 2025
作者: Chinmay Karkar, Paras Chopra
cs.AI

Abstract

Large Language Models (LLMs) demonstrate partial forecasting competence across social, political, and economic events. Yet, their predictive ability varies sharply with domain structure and prompt framing. We investigate how forecasting performance varies with different model families on real-world questions about events that happened beyond the model cutoff date. We analyze how context, question type, and external knowledge affect accuracy and calibration, and how adding factual news context modifies belief formation and failure modes. Our results show that forecasting ability is highly variable as it depends on what, and how, we ask.