Llama-3.1-FoundationAI-SecurityLLM-Reasoning-8B Technical Report
January 28, 2026
Authors: Zhuoran Yang, Ed Li, Jianliang He, Aman Priyanshu, Baturay Saglam, Paul Kassianik, Sajana Weerawardhena, Anu Vellore, Blaine Nelson, Neusha Javidnia, Arthur Goldblatt, Fraser Burch, Avi Zohary, Assaf Eisenman, Mahdi Sabbaghi, Supriti Vijay, Rahim Dharssi, Dhruv Kedia, Kojin Oshiba, Yaron Singer, Amin Karbasi
cs.AI
Abstract
We present Foundation-Sec-8B-Reasoning, the first open-source native reasoning model for cybersecurity. Built upon our previously released Foundation-Sec-8B base model (derived from Llama-3.1-8B-Base), the model is trained through a two-stage process combining supervised fine-tuning (SFT) and reinforcement learning from verifiable rewards (RLVR). Our training leverages proprietary reasoning data spanning cybersecurity analysis, instruction-following, and mathematical reasoning. Evaluation across 10 cybersecurity benchmarks and 10 general-purpose benchmarks demonstrates performance competitive with significantly larger models on cybersecurity tasks while maintaining strong general capabilities. The model shows effective generalization on multi-hop reasoning tasks and strong safety performance when deployed with appropriate system prompts and guardrails. This work demonstrates that domain-specialized reasoning models can achieve strong performance on specialized tasks while maintaining broad general capabilities. We release the model publicly at https://huggingface.co/fdtn-ai/Foundation-Sec-8B-Reasoning.