Llama-3.1-FoundationAI-SecurityLLM-Base-8B Technical Report
April 28, 2025
Authors: Paul Kassianik, Baturay Saglam, Alexander Chen, Blaine Nelson, Anu Vellore, Massimo Aufiero, Fraser Burch, Dhruv Kedia, Avi Zohary, Sajana Weerawardhena, Aman Priyanshu, Adam Swanda, Amy Chang, Hyrum Anderson, Kojin Oshiba, Omar Santos, Yaron Singer, Amin Karbasi
cs.AI
Abstract
As transformer-based large language models (LLMs) increasingly permeate
society, they have revolutionized domains such as software engineering,
creative writing, and digital arts. However, their adoption in cybersecurity
remains limited by challenges such as the scarcity of specialized training data
and the complexity of representing cybersecurity-specific knowledge. To address
these gaps, we present Foundation-Sec-8B, a cybersecurity-focused LLM built on
the Llama 3.1 architecture and enhanced through continued pretraining on a
carefully curated cybersecurity corpus. We evaluate Foundation-Sec-8B across
both established and new cybersecurity benchmarks, showing that it matches
Llama 3.1-70B and GPT-4o-mini on certain cybersecurity-specific tasks. By
releasing our model to the public, we aim to accelerate progress and adoption
of AI-driven tools in both public and private cybersecurity contexts.