AstroLLaMA: Towards Specialized Foundation Models in Astronomy

September 12, 2023
Authors: Tuan Dung Nguyen, Yuan-Sen Ting, Ioana Ciucă, Charlie O'Neill, Ze-Chang Sun, Maja Jabłońska, Sandor Kruk, Ernest Perkowski, Jack Miller, Jason Li, Josh Peek, Kartheik Iyer, Tomasz Różański, Pranav Khetarpal, Sharaf Zaman, David Brodrick, Sergio J. Rodríguez Méndez, Thang Bui, Alyssa Goodman, Alberto Accomazzi, Jill Naiman, Jesse Cranney, Kevin Schawinski, UniverseTBD
cs.AI

Abstract

Large language models excel in many human-language tasks but often falter in highly specialized domains like scholarly astronomy. To bridge this gap, we introduce AstroLLaMA, a 7-billion-parameter model fine-tuned from LLaMA-2 using over 300,000 astronomy abstracts from arXiv. Optimized for traditional causal language modeling, AstroLLaMA achieves a 30% lower perplexity than LLaMA-2, showing marked domain adaptation. Despite having significantly fewer parameters, our model produces more insightful and scientifically relevant text completions and embeddings than state-of-the-art foundation models. AstroLLaMA serves as a robust, domain-specific model with broad fine-tuning potential. Its public release aims to spur astronomy-focused research, including automatic paper summarization and conversational agent development.
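
For readers unfamiliar with the metric, the perplexity comparison can be illustrated with a minimal sketch. It assumes the standard definition of perplexity for a causal language model (the exponential of the mean negative log-likelihood per token); the abstract does not describe the evaluation code, so the function names and the sample log-probabilities below are hypothetical and chosen only for illustration.

```python
import math


def perplexity(token_log_probs):
    """Perplexity as exp of the mean negative log-likelihood per token.

    token_log_probs: natural-log probabilities a model assigned to each
    ground-truth token of a held-out text (hypothetical input format).
    """
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    mean_nll = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(mean_nll)


def relative_reduction(baseline_ppl, adapted_ppl):
    """Fractional drop in perplexity relative to the baseline model."""
    return (baseline_ppl - adapted_ppl) / baseline_ppl


# Made-up log-probabilities for the same three tokens under two models.
# These are NOT values from the paper.
baseline_log_probs = [-2.0, -1.5, -2.5]
adapted_log_probs = [-1.6, -1.2, -2.1]

baseline_ppl = perplexity(baseline_log_probs)
adapted_ppl = perplexity(adapted_log_probs)
print(f"baseline perplexity: {baseline_ppl:.3f}")
print(f"adapted perplexity:  {adapted_ppl:.3f}")
print(f"relative reduction:  {relative_reduction(baseline_ppl, adapted_ppl):.1%}")
```

Under this definition, a 30% lower perplexity means the adapted model's perplexity is 0.7 times the baseline's on the same evaluation text.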