Llama-3.1-FoundationAI-SecurityLLM-Base-8B Technical Report
April 28, 2025
Authors: Paul Kassianik, Baturay Saglam, Alexander Chen, Blaine Nelson, Anu Vellore, Massimo Aufiero, Fraser Burch, Dhruv Kedia, Avi Zohary, Sajana Weerawardhena, Aman Priyanshu, Adam Swanda, Amy Chang, Hyrum Anderson, Kojin Oshiba, Omar Santos, Yaron Singer, Amin Karbasi
cs.AI
Abstract
As transformer-based large language models (LLMs) increasingly permeate society, they have revolutionized domains such as software engineering, creative writing, and digital arts. However, their adoption in cybersecurity remains limited due to challenges such as the scarcity of specialized training data and the complexity of representing cybersecurity-specific knowledge. To address these gaps, we present Foundation-Sec-8B, a cybersecurity-focused LLM built on the Llama 3.1 architecture and enhanced through continued pretraining on a carefully curated cybersecurity corpus. We evaluate Foundation-Sec-8B across both established and new cybersecurity benchmarks, showing that it matches Llama 3.1-70B and GPT-4o-mini on certain cybersecurity-specific tasks. By releasing our model to the public, we aim to accelerate progress and adoption of AI-driven tools in both public and private cybersecurity contexts.