Prithvi WxC: Foundation Model for Weather and Climate
September 20, 2024
作者: Johannes Schmude, Sujit Roy, Will Trojak, Johannes Jakubik, Daniel Salles Civitarese, Shraddha Singh, Julian Kuehnert, Kumar Ankur, Aman Gupta, Christopher E Phillips, Romeo Kienzler, Daniela Szwarcman, Vishal Gaur, Rajat Shinde, Rohit Lal, Arlindo Da Silva, Jorge Luis Guevara Diaz, Anne Jones, Simon Pfreundschuh, Amy Lin, Aditi Sheshadri, Udaysankar Nair, Valentine Anantharaj, Hendrik Hamann, Campbell Watson, Manil Maskey, Tsengdar J Lee, Juan Bernabe Moreno, Rahul Ramachandran
cs.AI
Abstract
Triggered by the realization that AI emulators can rival the performance of
traditional numerical weather prediction models running on HPC systems, there
is now an increasing number of large AI models that address use cases such as
forecasting, downscaling, or nowcasting. While the parallel developments in the
AI literature focus on foundation models -- models that can be effectively
tuned to address multiple, different use cases -- the developments on the
weather and climate side largely focus on single use cases with particular
emphasis on mid-range forecasting. We close this gap by introducing Prithvi
WxC, a 2.3 billion parameter foundation model developed using 160 variables
from the Modern-Era Retrospective Analysis for Research and Applications,
Version 2 (MERRA-2). Prithvi WxC employs an encoder-decoder-based architecture,
incorporating concepts from various recent transformer models to effectively
capture both regional and global dependencies in the input data. The model has
been designed to accommodate large token counts to model weather phenomena in
different topologies at fine resolutions. Furthermore, it is trained with a
mixed objective that combines the paradigms of masked reconstruction with
forecasting. We test the model on a set of challenging downstream tasks,
namely autoregressive rollout forecasting, downscaling, gravity wave flux
parameterization, and extreme event estimation. The pretrained model with 2.3
billion parameters, along with the associated fine-tuning workflows, has been
publicly released as an open-source contribution via Hugging Face.
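The abstract describes pretraining with a mixed objective that combines masked reconstruction with forecasting. The sketch below is a minimal, hypothetical illustration of what such a loss can look like in PyTorch; the toy model, tensor shapes, and function names are assumptions for exposition, not the released Prithvi WxC code.

```python
import torch
import torch.nn.functional as F

class TwoHeadedToy(torch.nn.Module):
    """Stand-in for the real encoder-decoder: one head reconstructs the
    (masked) current state, the other predicts the next state."""
    def __init__(self, dim):
        super().__init__()
        self.body = torch.nn.Linear(dim, dim)
        self.recon_head = torch.nn.Linear(dim, dim)
        self.forecast_head = torch.nn.Linear(dim, dim)

    def forward(self, x):
        h = torch.relu(self.body(x))
        return self.recon_head(h), self.forecast_head(h)

def mixed_objective(model, x_t, x_next, mask_ratio=0.5):
    # Masked-reconstruction term: hide a random subset of tokens and
    # penalize reconstruction error only on the hidden ones.
    mask = torch.rand(x_t.shape[:-1]) < mask_ratio   # (batch, tokens)
    x_in = x_t.masked_fill(mask.unsqueeze(-1), 0.0)
    recon, forecast = model(x_in)
    recon_loss = F.mse_loss(recon[mask], x_t[mask])
    # Forecasting term: predict the state at the next time step.
    forecast_loss = F.mse_loss(forecast, x_next)
    return recon_loss + forecast_loss

model = TwoHeadedToy(dim=8)
x_t, x_next = torch.randn(2, 16, 8), torch.randn(2, 16, 8)
loss = mixed_objective(model, x_t, x_next)
loss.backward()
```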
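One of the listed downstream tasks is autoregressive rollout forecasting: the model's forecast is repeatedly fed back in as the next input to extend the lead time. A minimal sketch of this loop, assuming a generic single-step forecast function rather than the actual model interface:

```python
import torch

@torch.no_grad()
def rollout(step_fn, x0, n_steps):
    """Feed each forecast back as the next input; step_fn maps the state
    at time t to the state at t + one lead time (interface assumed)."""
    states = [x0]
    for _ in range(n_steps):
        states.append(step_fn(states[-1]))
    return torch.stack(states)   # (n_steps + 1, *state_shape) trajectory

# Toy demo: a linear damping map standing in for a trained forecast model.
trajectory = rollout(lambda x: 0.9 * x, torch.ones(4, 4), n_steps=6)
print(trajectory.shape)          # torch.Size([7, 4, 4])
```

In practice, fine-tuning on rollouts of this form helps control the error accumulation that plagues purely single-step-trained emulators.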
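Since the pretrained weights are stated to be released via Hugging Face, they can presumably be fetched with the `huggingface_hub` library. The repository ID below is a placeholder and should be verified on the hub before use.

```python
from huggingface_hub import snapshot_download

# Placeholder repo ID; look up the actual Prithvi WxC repository
# on the Hugging Face hub before running.
weights_dir = snapshot_download(repo_id="Prithvi-WxC/prithvi.wxc.2300m.v1")
print(weights_dir)
```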