AION-1: Omnimodal Foundation Model for Astronomical Sciences

October 20, 2025
Authors: Liam Parker, Francois Lanusse, Jeff Shen, Ollie Liu, Tom Hehir, Leopoldo Sarra, Lucas Meyer, Micah Bowles, Sebastian Wagner-Carena, Helen Qu, Siavash Golkar, Alberto Bietti, Hatim Bourfoune, Nathan Casserau, Pierre Cornette, Keiya Hirashima, Geraud Krawezik, Ruben Ohana, Nicholas Lourie, Michael McCabe, Rudy Morel, Payel Mukhopadhyay, Mariel Pettee, Bruno Regaldo-Saint Blancard, Kyunghyun Cho, Miles Cranmer, Shirley Ho
cs.AI

Abstract

While foundation models have shown promise across a variety of fields, astronomy still lacks a unified framework for joint modeling across its highly diverse data modalities. In this paper, we present AION-1, a family of large-scale multimodal foundation models for astronomy. AION-1 integrates heterogeneous imaging, spectroscopic, and scalar data using a two-stage architecture: modality-specific tokenization followed by transformer-based masked modeling of cross-modal token sequences. The model is pretrained on five large-scale surveys: Legacy Survey, Hyper Suprime-Cam (HSC), Sloan Digital Sky Survey (SDSS), Dark Energy Spectroscopic Instrument (DESI), and Gaia. These span more than 200 million observations of stars, galaxies, and quasars. With a single frozen encoder, AION-1 achieves strong results on a broad suite of downstream tasks, including galaxy and stellar property estimation, galaxy morphology classification, similarity-based retrieval, galaxy image segmentation, and spectral super-resolution. We release AION-1 model variants ranging from 300 M to 3.1 B parameters. Beyond astronomy, AION-1 provides a scalable blueprint for multimodal scientific foundation models that can seamlessly integrate noisy, instrument-specific observations. All code, tokenizers, pretrained weights, and a lightweight evaluation suite are released under an open-source license.
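
The two-stage idea described above (modality-specific tokenization, then masked modeling over the combined token sequence) can be illustrated with a toy sketch. This is not the AION-1 implementation: the abstract does not specify the tokenizers, so uniform binning is substituted here, and every name (`tokenize`, `mask_sequence`, `N_BINS`, `MASK_ID`), the vocabulary layout, and the sample values are assumptions made purely for illustration. The transformer that would predict the hidden tokens is omitted.

```python
import random

# Hypothetical vocabulary layout: each modality gets its own block of token ids.
N_BINS = 16
MODALITIES = ["image", "spectrum", "scalar"]
OFFSETS = {name: i * N_BINS for i, name in enumerate(MODALITIES)}
MASK_ID = len(MODALITIES) * N_BINS  # one extra id reserved for masked positions


def tokenize(values, modality, lo=0.0, hi=1.0):
    """Stage 1 (toy): uniformly quantize floats in [lo, hi] into modality-specific ids."""
    tokens = []
    for v in values:
        clipped = min(max(v, lo), hi)
        bin_index = min(int((clipped - lo) / (hi - lo) * N_BINS), N_BINS - 1)
        tokens.append(OFFSETS[modality] + bin_index)
    return tokens


def mask_sequence(tokens, mask_fraction, rng):
    """Stage 2 input (toy): hide a random subset of positions; return inputs and targets."""
    n_masked = max(1, round(mask_fraction * len(tokens)))
    masked_positions = set(rng.sample(range(len(tokens)), n_masked))
    inputs = [MASK_ID if i in masked_positions else t for i, t in enumerate(tokens)]
    targets = {i: tokens[i] for i in masked_positions}
    return inputs, targets


# One made-up object observed in three modalities (all values are illustrative).
observation = {
    "image": [0.10, 0.80, 0.55, 0.30],  # flattened pixel intensities
    "spectrum": [0.20, 0.25, 0.90],     # flux samples
    "scalar": [0.42],                   # a normalized catalog quantity
}

# Concatenate the per-modality tokens into a single cross-modal sequence.
sequence = []
for name in MODALITIES:
    sequence.extend(tokenize(observation[name], name))

inputs, targets = mask_sequence(sequence, mask_fraction=0.5, rng=random.Random(0))
print("tokens: ", sequence)
print("inputs: ", inputs)
print("targets:", targets)
```

In a masked-modeling setup of this kind, a model would be trained to recover `targets` from `inputs`. Because masked positions can fall in any modality, the same objective covers both within-modality and cross-modality prediction.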