GenAI security

GenAI Security Study Plan (GenAI/LLM)

End-to-end roadmap for LLM pentesting, GenAI security assessment, secure architecture, governance, and AI-specific threat modeling.

This plan focuses on security-first GenAI learning, not core ML research depth. It is designed to build practical capability in LLM security testing, secure implementation patterns, risk management, and responsible AI controls.

Expected pace

6-9 months

Fast-evolving domain. Revisit controls and tooling regularly.

Focus areas

LLM pentesting and GenAI security assessments.

Secure design for RAG, fine-tuning, and inference systems.

Governance, compliance, and AI risk management.

Guardrails, safety, monitoring, and incident response.

In short

Field is evolving quickly, keep learning loop active.

Use OWASP LLM Top 10 and NIST AI RMF as baseline references.

Threat model AI systems and assess business impact.

Treat prompts, models, tools, data, and agents as attack surfaces.

Combine manual testing, automation, and policy controls.

GenAI/LLM Fundamental Concepts Prompt Engineering RAG (Retrieval Augmented Generation)Fine Tuning AI Agents Agentic AI MCP (Model Context Protocol)Certifications GenAI Interview Questions GenAI Security Tools Additional Resources

Section 1

GenAI/LLM Fundamental Concepts

4 weeks

Build strong foundation in LLM architecture, common attack vectors, governance frameworks, and AI threat modeling.

Week 1: AI/ML Foundations and LLM Basics

Understand AI vs ML vs deep learning vs GenAI.

Foundation models and transformer architecture basics.

Attention mechanisms, tokenization, embeddings, positional encoding.

Pre-training versus fine-tuning concepts.

Model landscape: GPT, Claude, Llama, Gemini, open-source versus proprietary.

What are Foundation Models Generative AI with LLMs (Coursera)Illustrated Transformer

Week 2: LLM Security Fundamentals

Study OWASP LLM Top 10 risk categories in depth.

Prompt injection, insecure output handling, data poisoning, model DoS.

Supply chain vulnerabilities, sensitive data disclosure, excessive agency.

Model theft, plugin risks, overreliance concerns.

Attack vectors: jailbreaking, model extraction, adversarial examples, membership inference.

OWASP Top 10 for LLM Applications Prompt Injection and Jailbreaking

Week 3: AI Governance and Compliance

NIST AI RMF and AI RMF Playbook essentials.

EU AI Act high-level obligations.

AI ethics: bias, transparency, privacy, accountability, human oversight.

NIST AI RMF EU AI Act NIST AI RMF Playbook

Week 4: Threat Modeling and Risk Assessment

AI-specific threat modeling methods and trust boundaries.

Risk scoring for business and technical impact.

Adversarial ML and failure-mode analysis.

Microsoft AI/ML Threat Modeling AI Threat Modeling (Matillion)NIST Adversarial ML Failure Modes in ML Quick AI Threat Model Check

Section 2

Prompt Engineering

1 week

Learn prompt design and prompt-defense patterns for secure LLM interactions.

Prompt styles: zero-shot, few-shot, chain-of-thought aware usage.

Security-focused prompting: defensive templates and constrained outputs.

Prompt injection classes: direct, indirect, jailbreak, prompt leakage.

Defenses: prompt parameterization, input validation, output filtering, role boundaries.

Hands-on: create secure prompt templates and red-team them.

Section 3

RAG (Retrieval Augmented Generation)

1-2 weeks

Understand RAG architecture and RAG-specific security risks and controls.

Week 1: RAG Fundamentals

RAG components: retrieval, knowledge base, generation.

Embeddings, vector databases, chunking strategy trade-offs.

Simple vs advanced RAG and hybrid search patterns.

RAG: The Essential Guide Why RAG is Revolutionising GenAI

Week 2: RAG Security (Optional Deep Dive)

RAG risks: context injection, permission bypass, info leakage.

Secure doc processing, access control, retrieval filtering.

Riding the RAG Trail Security Risks with RAG Mitigating RAG Risks

Section 4

Fine Tuning

2 weeks

Learn fine-tuning workflows and security risks in training data and model adaptation.

Week 1: Fine-Tuning Fundamentals

Pre-training vs fine-tuning vs prompt engineering decisions.

Fine-tuning methods: full FT, LoRA, QLoRA.

SFT, RLHF, and constitutional approaches.

Week 2: Fine-Tuning Security

Training data security, privacy, and poisoning risks.

Backdoor attacks and model extraction concerns.

Secure training environment, data validation, provenance, and testing.

Section 5

AI Agents

1 week

Cover agent patterns and security controls for tool use, memory, and action boundaries.

Agent fundamentals: ReAct, plan-and-execute, multi-agent basics.

Tool calling and state or memory security implications.

Risks: excessive agency, privilege escalation, unsafe tool invocation.

Defenses: least privilege, action approval workflows, behavior monitoring.

Section 6

Agentic AI

1 week

Study autonomous systems, emergent behavior risks, and governance controls.

Autonomous decision making and goal-driven behavior.

Inter-agent trust and communication security.

Risks: goal misalignment, specification gaming, emergent behavior.

Governance: constraints, oversight, intervention, auditing.

Section 7

MCP (Model Context Protocol)

1 week

Understand MCP architecture and secure context/tool integration patterns.

MCP basics: client-server patterns, resource and tool discovery.

MCP deployment: secure server setup and controlled context sharing.

Security: authN/authZ, access control, network and transport protections.

Operational controls: monitoring, logging, and MCP incident response.

Section 8

Certifications

Choose cert path by role goal: pentesting, cloud AI security, governance, engineering.

Certified AI/ML Pentester

AWS ML Specialty, GCP Professional ML Engineer, Azure AI Engineer.

AttackIQ Foundations of AI Security AI for Cybersecurity Specialization IBM GenAI for Cybersecurity Professionals

Section 9

GenAI Interview Questions

Practice technical, scenario, governance, and incident-response AI security interview depth.

Explain transformer architecture and security implications.

Walk through OWASP LLM Top 10 with practical examples.

How would you test prompt injection and jailbreak resilience?

What are security controls for RAG and fine-tuning pipelines?

How would you investigate LLM data leakage incidents?

How does EU AI Act and NIST AI RMF shape deployment decisions?

Section 10

GenAI Security Tools

Survey open-source and commercial tools for scanning, guardrails, monitoring, and testing.

LLM Guard LLM Guard Playground ModelScan AI Exploits Garak PromptFoo Huntr AI/ML Bug Bounty PortSwigger Web LLM Attacks

Lakera Guard, Robust Intelligence, WhyLabs for enterprise monitoring.

Cloud-native controls: AWS Bedrock Guardrails, Azure AI Content Safety, Google Cloud AI security controls.

Implementation checklist: tool evaluation, filtering, scanning in CI/CD, incident response playbooks.

Section 11

Additional Resources

Keep continuous learning loop through courses, checklists, blogs, tools, and CTFs.

Stanford CS324 Princeton COS 597G Coursera Generative AI Engineering with LLMs OWASP LLM Security and Governance Checklist LLM Security Portal Gandalf Challenge Prompt Airlines CTF WhyLabs Intro to LLM Security