Llama-3.1-FoundationAI-SecurityLLM-Reasoning-8B Technical Report
January 28, 2026
Authors: Zhuoran Yang, Ed Li, Jianliang He, Aman Priyanshu, Baturay Saglam, Paul Kassianik, Sajana Weerawardhena, Anu Vellore, Blaine Nelson, Neusha Javidnia, Arthur Goldblatt, Fraser Burch, Avi Zohary, Assaf Eisenman, Mahdi Sabbaghi, Supriti Vijay, Rahim Dharssi, Dhruv Kedia, Kojin Oshiba, Yaron Singer, Amin Karbasi
cs.AI
Abstract
We present Foundation-Sec-8B-Reasoning, the first open-source native reasoning model for cybersecurity. Built upon our previously released Foundation-Sec-8B base model (derived from Llama-3.1-8B-Base), the model is trained through a two-stage process combining supervised fine-tuning (SFT) and reinforcement learning from verifiable rewards (RLVR). Our training leverages proprietary reasoning data spanning cybersecurity analysis, instruction-following, and mathematical reasoning. Evaluation across 10 cybersecurity benchmarks and 10 general-purpose benchmarks demonstrates performance competitive with significantly larger models on cybersecurity tasks while maintaining strong general capabilities. The model shows effective generalization on multi-hop reasoning tasks and strong safety performance when deployed with appropriate system prompts and guardrails. This work demonstrates that domain-specialized reasoning models can achieve strong performance on specialized tasks while maintaining broad general capabilities. We release the model publicly at https://huggingface.co/fdtn-ai/Foundation-Sec-8B-Reasoning.