Llama-3.1-FoundationAI-SecurityLLM-Base-8B Technical Report
April 28, 2025
Authors: Paul Kassianik, Baturay Saglam, Alexander Chen, Blaine Nelson, Anu Vellore, Massimo Aufiero, Fraser Burch, Dhruv Kedia, Avi Zohary, Sajana Weerawardhena, Aman Priyanshu, Adam Swanda, Amy Chang, Hyrum Anderson, Kojin Oshiba, Omar Santos, Yaron Singer, Amin Karbasi
cs.AI
Abstract
As transformer-based large language models (LLMs) increasingly permeate
society, they have revolutionized domains such as software engineering,
creative writing, and digital arts. However, their adoption in cybersecurity
remains limited by challenges such as the scarcity of specialized training data
and the complexity of representing cybersecurity-specific knowledge. To address
these gaps, we present Foundation-Sec-8B, a cybersecurity-focused LLM built on
the Llama 3.1 architecture and enhanced through continued pretraining on a
carefully curated cybersecurity corpus. We evaluate Foundation-Sec-8B across
both established and new cybersecurity benchmarks, showing that it matches
Llama 3.1-70B and GPT-4o-mini on certain cybersecurity-specific tasks. By
releasing our model to the public, we aim to accelerate progress and adoption
of AI-driven tools in both public and private cybersecurity contexts.