Back to roadmap
GenAI security

GenAI Security Study Plan (GenAI/LLM)

End-to-end roadmap for LLM pentesting, GenAI security assessment, secure architecture, governance, and AI-specific threat modeling.

This plan focuses on security-first GenAI learning, not core ML research depth. It is designed to build practical capability in LLM security testing, secure implementation patterns, risk management, and responsible AI controls.

Expected pace

6-9 months

Fast-evolving domain. Revisit controls and tooling regularly.

Focus areas

LLM pentesting and GenAI security assessments.
Secure design for RAG, fine-tuning, and inference systems.
Governance, compliance, and AI risk management.
Guardrails, safety, monitoring, and incident response.

In short

Field is evolving quickly, keep learning loop active.
Use OWASP LLM Top 10 and NIST AI RMF as baseline references.
Threat model AI systems and assess business impact.
Treat prompts, models, tools, data, and agents as attack surfaces.
Combine manual testing, automation, and policy controls.
GenAI/LLM Fundamental ConceptsPrompt EngineeringRAG (Retrieval Augmented Generation)Fine TuningAI AgentsAgentic AIMCP (Model Context Protocol)CertificationsGenAI Interview QuestionsGenAI Security ToolsAdditional Resources

Section 1

GenAI/LLM Fundamental Concepts

4 weeks

Build strong foundation in LLM architecture, common attack vectors, governance frameworks, and AI threat modeling.

Week 1: AI/ML Foundations and LLM Basics

Understand AI vs ML vs deep learning vs GenAI.
Foundation models and transformer architecture basics.
Attention mechanisms, tokenization, embeddings, positional encoding.
Pre-training versus fine-tuning concepts.
Model landscape: GPT, Claude, Llama, Gemini, open-source versus proprietary.
What are Foundation ModelsGenerative AI with LLMs (Coursera)Illustrated Transformer

Week 2: LLM Security Fundamentals

Study OWASP LLM Top 10 risk categories in depth.
Prompt injection, insecure output handling, data poisoning, model DoS.
Supply chain vulnerabilities, sensitive data disclosure, excessive agency.
Model theft, plugin risks, overreliance concerns.
Attack vectors: jailbreaking, model extraction, adversarial examples, membership inference.
OWASP Top 10 for LLM ApplicationsPrompt Injection and Jailbreaking

Week 3: AI Governance and Compliance

NIST AI RMF and AI RMF Playbook essentials.
EU AI Act high-level obligations.
AI ethics: bias, transparency, privacy, accountability, human oversight.
NIST AI RMFEU AI ActNIST AI RMF Playbook

Week 4: Threat Modeling and Risk Assessment

AI-specific threat modeling methods and trust boundaries.
Risk scoring for business and technical impact.
Adversarial ML and failure-mode analysis.
Microsoft AI/ML Threat ModelingAI Threat Modeling (Matillion)NIST Adversarial MLFailure Modes in MLQuick AI Threat Model Check

Section 2

Prompt Engineering

1 week

Learn prompt design and prompt-defense patterns for secure LLM interactions.

Prompt styles: zero-shot, few-shot, chain-of-thought aware usage.
Security-focused prompting: defensive templates and constrained outputs.
Prompt injection classes: direct, indirect, jailbreak, prompt leakage.
Defenses: prompt parameterization, input validation, output filtering, role boundaries.
Hands-on: create secure prompt templates and red-team them.

Section 3

RAG (Retrieval Augmented Generation)

1-2 weeks

Understand RAG architecture and RAG-specific security risks and controls.

Week 1: RAG Fundamentals

RAG components: retrieval, knowledge base, generation.
Embeddings, vector databases, chunking strategy trade-offs.
Simple vs advanced RAG and hybrid search patterns.
RAG: The Essential GuideWhy RAG is Revolutionising GenAI

Week 2: RAG Security (Optional Deep Dive)

RAG risks: context injection, permission bypass, info leakage.
Secure doc processing, access control, retrieval filtering.
Riding the RAG TrailSecurity Risks with RAGMitigating RAG Risks

Section 4

Fine Tuning

2 weeks

Learn fine-tuning workflows and security risks in training data and model adaptation.

Week 1: Fine-Tuning Fundamentals

Pre-training vs fine-tuning vs prompt engineering decisions.
Fine-tuning methods: full FT, LoRA, QLoRA.
SFT, RLHF, and constitutional approaches.

Week 2: Fine-Tuning Security

Training data security, privacy, and poisoning risks.
Backdoor attacks and model extraction concerns.
Secure training environment, data validation, provenance, and testing.

Section 5

AI Agents

1 week

Cover agent patterns and security controls for tool use, memory, and action boundaries.

Agent fundamentals: ReAct, plan-and-execute, multi-agent basics.
Tool calling and state or memory security implications.
Risks: excessive agency, privilege escalation, unsafe tool invocation.
Defenses: least privilege, action approval workflows, behavior monitoring.

Section 6

Agentic AI

1 week

Study autonomous systems, emergent behavior risks, and governance controls.

Autonomous decision making and goal-driven behavior.
Inter-agent trust and communication security.
Risks: goal misalignment, specification gaming, emergent behavior.
Governance: constraints, oversight, intervention, auditing.

Section 7

MCP (Model Context Protocol)

1 week

Understand MCP architecture and secure context/tool integration patterns.

MCP basics: client-server patterns, resource and tool discovery.
MCP deployment: secure server setup and controlled context sharing.
Security: authN/authZ, access control, network and transport protections.
Operational controls: monitoring, logging, and MCP incident response.

Section 9

GenAI Interview Questions

Practice technical, scenario, governance, and incident-response AI security interview depth.

Explain transformer architecture and security implications.
Walk through OWASP LLM Top 10 with practical examples.
How would you test prompt injection and jailbreak resilience?
What are security controls for RAG and fine-tuning pipelines?
How would you investigate LLM data leakage incidents?
How does EU AI Act and NIST AI RMF shape deployment decisions?

Section 10

GenAI Security Tools

Survey open-source and commercial tools for scanning, guardrails, monitoring, and testing.

LLM GuardLLM Guard PlaygroundModelScanAI ExploitsGarakPromptFooHuntr AI/ML Bug BountyPortSwigger Web LLM Attacks
Lakera Guard, Robust Intelligence, WhyLabs for enterprise monitoring.
Cloud-native controls: AWS Bedrock Guardrails, Azure AI Content Safety, Google Cloud AI security controls.
Implementation checklist: tool evaluation, filtering, scanning in CI/CD, incident response playbooks.