
**Llama-3.1-FoundationAI-SecurityLLM-Reasoning-8B Technical Report**

January 28, 2026
Authors: Zhuoran Yang, Ed Li, Jianliang He, Aman Priyanshu, Baturay Saglam, Paul Kassianik, Sajana Weerawardhena, Anu Vellore, Blaine Nelson, Neusha Javidnia, Arthur Goldblatt, Fraser Burch, Avi Zohary, Assaf Eisenman, Mahdi Sabbaghi, Supriti Vijay, Rahim Dharssi, Dhruv Kedia, Kojin Oshiba, Yaron Singer, Amin Karbasi
cs.AI

Abstract

We present Foundation-Sec-8B-Reasoning, the first open-source native reasoning model for cybersecurity. Built upon our previously released Foundation-Sec-8B base model (derived from Llama-3.1-8B-Base), the model is trained through a two-stage process combining supervised fine-tuning (SFT) and reinforcement learning from verifiable rewards (RLVR). Our training leverages proprietary reasoning data spanning cybersecurity analysis, instruction-following, and mathematical reasoning. Evaluation across 10 cybersecurity benchmarks and 10 general-purpose benchmarks demonstrates performance competitive with significantly larger models on cybersecurity tasks while maintaining strong general capabilities. The model shows effective generalization on multi-hop reasoning tasks and strong safety performance when deployed with appropriate system prompts and guardrails. This work demonstrates that domain-specialized reasoning models can achieve strong performance on specialized tasks while maintaining broad general capabilities. We release the model publicly at https://huggingface.co/fdtn-ai/Foundation-Sec-8B-Reasoning.
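The abstract's second training stage, reinforcement learning from verifiable rewards (RLVR), relies on reward functions that can be checked programmatically rather than learned. As a rough illustration of that idea (not the paper's actual implementation; all names and answer-format conventions below are assumptions), a minimal verifiable reward might compare a model completion's final answer against a known reference:

```python
# Minimal sketch of a "verifiable reward" as used in RLVR-style training.
# Illustrative only: the answer-extraction conventions (\boxed{...} or a
# "Final answer:" marker) are assumptions, not taken from the report.
import re

def verifiable_reward(completion: str, reference_answer: str) -> float:
    """Return 1.0 if the completion's final answer matches the reference, else 0.0."""
    match = re.search(r"\\boxed\{([^}]*)\}", completion)
    if match is None:
        match = re.search(r"Final answer:\s*(.+)", completion)
    if match is None:
        return 0.0  # no extractable answer -> no reward
    predicted = match.group(1).strip().lower()
    return 1.0 if predicted == reference_answer.strip().lower() else 0.0
```

In an RLVR loop, a binary signal like this would score sampled completions (e.g., for a math problem or a cybersecurity question with a checkable answer) and drive the policy update, avoiding the need for a learned reward model.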