VCA-ADV-102: Adversarial Techniques II: LLM-CVE Reproduction

Belt 4/5 · deep technical

Virtus Academy · Adversarial Track

ADV-102 is the LLM-era variant of ADV-101's CVE-to-Tool pedagogy. ADV-101 took a memorable classical CVE, had students reproduce it end-to-end, and shipped a Burp-Suite-style reproduction tool plus a coordinated-disclosure-style writeup. ADV-102 mirrors that structure for the LLM era: the primary target is CVE-2025-65106 (LangChain Jinja2 SSTI), the academy's canonical LLM-era CVE-to-Tool target. Students reproduce the CVE on a controlled local install, build a defensible reproduction tool that detects vulnerable versions, and ship a coordinated-disclosure-style report.

Total time: ~115 hours

Lecture: ~22 hr

Practical / lab: ~40 hr

Independent practice: ~53 hr

Position: After ADV-101 + AI-101

Prereq: ADV-101 + AI-101 (central); SEC-101 + PEN-101 strongly recommended

Equipment: Laptop with Python 3.11+, virtualenv, LangChain pinned to a vulnerable version, OpenAI/Anthropic SDK keys, and Burp Suite Community; no hardware required. TIR-3 Phase-2 (forward-stretch): an academy Burp/ZAP-equivalent browser-based WebAssembly tool with a full HTTP intercept proxy, scanner, repeater, and intruder module, intended to replace Burp Community for agentic-endpoint interception labs (LoE-D-E; commercial-grade scope; ships after the TIR-3 Phase-1 OWASP WebGoat + DVWA validation). (See hardware platform; we update this as the kit firms up.)

Credential: VCA-ADV-102 Certificate of Completion

Register interest. We're not taking enrollments yet. Email interested@virtuscyberacademy.org.

Course Overview

ADV-102 is the academy's LLM-era CVE-to-Tool course. The pedagogical structure mirrors ADV-101 exactly: a single named CVE; a full reproduction; a defensible tool that detects the vulnerability; a coordinated-disclosure-style report. The LLM-era variant uses CVE-2025-65106 (LangChain Jinja2 SSTI) as the target because the bug is recent enough to be relevant, well-documented enough to reproduce cleanly, generalisable enough to teach the bug class, and patched enough that the academy can offer the lab without responsible-disclosure tension.

The chapter's thesis: LLM-era CVEs land in libraries that production agentic systems depend on, not in models themselves. Students who finish ADV-102 understand that the attack surface of an agentic system is not the LLM itself; it is the prompt-rendering pipeline, the template engines, the deserialisation layers, and the tool-calling boundaries that surround the LLM. This is where the next 10 years of practical LLM security work will land.

Position relative to peer offerings. ADV-102 is the first formal curriculum at this level that pairs ADV-101's CVE-to-Tool methodology with an LLM-era CVE. Industry training (HackTheBox, OffSec) does not yet offer a comparable LLM-CVE-reproduction track. The academy's coordinated-disclosure discipline (taught explicitly in Module 8) is the part industry tooling tends to skip.

Pedagogy. ADV-102 inherits ADV-101's tone and structure. The cross-system comparison carries the cross-language SSTI thread: Jinja2 in Python, Gonja in Go (CVE-2025-9556 in AI-201's curriculum), Eta in JavaScript, FreeMarker in Java. The course shows that the bug class generalises and that defending the Python flavour leaves the others exposed unless the same defence is applied to each of them.
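
The cross-language thread can be made concrete with a small probe table. The sketch below is illustrative only: the dictionary layout, the `classify` helper, and the choice of arithmetic factors are assumptions, not the academy's mapping tool. It pairs each engine with a harmless arithmetic probe written in that engine's interpolation delimiters, then classifies a response body by whether the probe came back verbatim (treated as data) or as the computed product (evaluated as code).

```python
# Illustrative sketch, not the course's actual mapping tool.
# A distinctive product (913*7) is used so that a stray "49" on a page
# does not produce a false positive.
EXPR = "913*7"
EXPECTED = str(913 * 7)  # "6391"

# One harmless arithmetic probe per engine, in that engine's delimiters.
PROBES = {
    "jinja2 (Python)": "{{ " + EXPR + " }}",
    "gonja (Go)": "{{ " + EXPR + " }}",
    "eta (JavaScript)": "<%= " + EXPR + " %>",
    "freemarker (Java)": "${" + EXPR + "}",
}


def classify(probe: str, response_body: str) -> str:
    """Classify a single response to a single probe."""
    if probe in response_body:
        return "reflected-as-data"   # engine left the input untouched
    if EXPECTED in response_body:
        return "evaluated"           # engine computed the expression
    return "inconclusive"


for engine, probe in PROBES.items():
    print(f"{engine:20} {probe}")
```

A lab harness would send each probe to its own controlled testbed and feed the response to `classify`; the probes compute a number and nothing else, which keeps the comparison safe to run repeatedly.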

How the Course Teaches: Foundational Readings

ADV-102 carries the same paired-textbook system as ADV-101, now applied to the LLM-era CVE variant. The WAHH web-application mental model, established in PEN-101 and deepened in ADV-101, is here applied specifically to template injection in agentic stacks. The OWASP LLM Top 10 and OWASP ASI (Agentic Security Initiative) Top 10 provide the broader taxonomy that makes CVE-2025-65106 intelligible as a representative instance of a class, not a one-off quirk.

Narrative weave: Stuttard & Pinto, The Web Application Hacker's Handbook Chs 8-9 (template injection and server-side rendering attacks) applied to LLM stacks.

Stuttard and Pinto's treatment of server-side template injection predates LangChain by a decade, but the mechanism it describes is structurally identical to CVE-2025-65106. Their account of how a templating engine that evaluates user-controlled input as code, rather than treating it as data, produces a code-execution surface explains why the Jinja2 SSTI in LangChain's prompt-template layer was discoverable in the first place. Module 3's Flask-based anchor lab reproduces a generic Jinja2 SSTI first; that ground-level reproduction is Stuttard and Pinto's chapter applied to a modern Python web stack. Students who have read Stuttard and Pinto's SSTI section arrive at Module 4's LangChain-specific reproduction understanding what they are looking for before the first payload fires.
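
The code-versus-data distinction can be seen in a few lines of Jinja2. This is a minimal sketch under stated assumptions: it uses Jinja2 directly rather than Flask or LangChain, and the "attacker input" is a harmless arithmetic probe, not the payload from the CVE. It is not the Module 3 lab itself.

```python
from jinja2 import Environment

env = Environment()

# Harmless arithmetic probe standing in for attacker-controlled input.
untrusted = "{{ 7*7 }}"

# Safe pattern: the template source is fixed by the developer and the
# input is passed in only as a value, so the engine treats it as data.
as_data = env.from_string("Hello {{ name }}").render(name=untrusted)

# Vulnerable pattern: the input is concatenated into the template source,
# so the engine parses and evaluates it as template code.
as_code = env.from_string("Hello " + untrusted).render()

print(as_data)  # Hello {{ 7*7 }}
print(as_code)  # Hello 49
```

The same input produces two different outputs depending only on whether it reaches the engine as a render argument or as template source, which is the whole bug class in miniature.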

OWASP cross-cut: LLM Top 10 (owasp.org/www-project-top-10-for-large-language-model-applications) + ASI Top 10 (OWASP Top 10 for Agentic AI Applications).

CVE-2025-65106 is a concrete instance of OWASP LLM Top 10 item LLM01 (Prompt Injection) in the specific sense that the injection reaches a template renderer rather than the model itself. The OWASP ASI Top 10 extends this taxonomy to agentic systems: where a single LLM call has one injection surface, an agentic workflow with tool-calling and multi-step reasoning has many. Module 2's LangChain architecture trace establishes which components carry which OWASP LLM/ASI classification; the capstone report's coordinated-disclosure section explicitly maps the CVE to the relevant OWASP taxonomy items. SEC-101 alumni will recognize the OWASP framing from the AI-strand forward pointer in that course; ADV-102 is where that forward pointer lands.

Narrative weave: Seitz & Arnold, Black Hat Python Ch 10 (Volatility + forensic scripting) applied to LLM-CVE post-exploitation analysis.

Seitz and Arnold's chapter on Volatility and memory forensics frames a question that ADV-102's Module 6 tool-building work takes seriously: once a reproduction tool fires a payload, what forensic evidence does the attacker leave behind, and what evidence does the defender collect? The module's reproduction tool outputs a structured detector report; that report's forensic-trace section is where Seitz and Arnold's forensic-scripting discipline applies to an LLM-era context. The cross-language SSTI mapping tool in Module 7 uses the same instrumentation philosophy: emit enough structured output that an independent analyst can reconstruct what the tool did without running it again.
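
One way to picture a report an analyst can replay on paper is a small structured record with an ordered, timestamped trace. The sketch below is an assumption about what such a report could look like; the class names, field names, and verdict strings are hypothetical and are not the academy's detector-report format.

```python
import json
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone


def _utc_now() -> str:
    """UTC timestamp in ISO 8601 form."""
    return datetime.now(timezone.utc).isoformat()


@dataclass
class TraceEvent:
    step: str      # short machine-readable name of the action taken
    detail: str    # human-readable description of what was done
    timestamp: str = field(default_factory=_utc_now)


@dataclass
class DetectorReport:
    target: str
    cve_id: str
    verdict: str = "unknown"
    trace: list[TraceEvent] = field(default_factory=list)

    def record(self, step: str, detail: str) -> None:
        """Append one event so the run can be reconstructed in order."""
        self.trace.append(TraceEvent(step, detail))

    def to_json(self) -> str:
        return json.dumps(asdict(self), indent=2)


report = DetectorReport(target="local-venv", cve_id="CVE-2025-65106")
report.record("version-check", "read installed package metadata")
report.record("probe", "rendered a harmless arithmetic probe")
report.verdict = "vulnerable"
print(report.to_json())
```

Recording every action as it happens, rather than summarising at the end, is what lets a reviewer confirm the tool did only what it claims.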

Curriculum Outline

Ten modules across ~10 weeks.

Module	Topic	Project
1	The CVE-to-Tool methodology, recapped from ADV-101	2-page mapping table comparing ADV-101 target to ADV-102 target
2	LangChain architecture & the templating pipeline	Trace a prompt through LangChain Expression Language; identify the templating step
3	Jinja2 SSTI, the bug class	Reproduce a generic Jinja2 SSTI in a Flask app to anchor the bug class
4	CVE-2025-65106, the specific instance	Pin vulnerable LangChain version; reproduce the chain
5	The patch & the defender lens	Read the upstream patch; identify the missing input validation
6	Building the reproduction tool (CVE detector)	Build a Python tool that scans a target for vulnerable versions; outputs detector report
7	Cross-language generalisation	Reproduce the Go cousin (CVE-2025-9556); pair with the Python target
8	Coordinated-disclosure discipline	Walk a hypothetical disclosure timeline; produce a vendor-readable report
9	Defensible reproduction-tool deployment	Package the tool; document; publish to a private repo for instructor review
10	Capstone: full CVE reproduction + tool + report	Submit reproduction harness + tool + 6-8 page coordinated-disclosure-style report + 5-min recorded demo
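
Module 8's hypothetical disclosure timeline can be sketched as simple date arithmetic. Everything specific here is an assumption for illustration: the milestone labels, the day offsets (including the 90-day window, which is a common industry convention and not necessarily the course's), and the example report date.

```python
from datetime import date, timedelta

# Hypothetical milestones as (label, days after the initial report).
# The offsets are illustrative; a real timeline is negotiated with the vendor.
MILESTONES = [
    ("report sent to vendor", 0),
    ("acknowledgement expected", 7),
    ("status check-in", 45),
    ("public disclosure deadline", 90),
]


def build_timeline(reported_on: date, milestones=MILESTONES):
    """Return (label, calendar date) pairs for each milestone."""
    return [
        (label, reported_on + timedelta(days=offset))
        for label, offset in milestones
    ]


for label, day in build_timeline(date(2025, 1, 6)):
    print(f"{day.isoformat()}  {label}")
```

Keeping the offsets in one table makes it easy to rerun the walk with a different policy and see how each deadline moves.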

Learning Outcomes

Remember. State the CVE identifier, CVSS score, affected versions, and patched version of CVE-2025-65106.
Understand. Explain why Jinja2 SSTI is the canonical agentic-system bug class for Python-based stacks.
Apply. Reproduce CVE-2025-65106 end-to-end on a controlled local install.
Apply. Build a defensible reproduction tool that detects vulnerable versions of LangChain.
Apply. Reproduce the Go cousin (CVE-2025-9556) and identify the cross-language pattern.
Analyze. Read the upstream patch and identify the missing input validation.
Synthesize. Ship the capstone: reproduction harness + tool + 6-8 page coordinated-disclosure report.

Hands-On Labs

Lab 3.1: generic Jinja2 SSTI in a Flask app (anchor the bug class).
Lab 4.1 (signature): CVE-2025-65106 LangChain Jinja2 SSTI end-to-end.
Lab 5.1: read the upstream patch; defender lens.
Lab 6.1: build the reproduction tool; outputs detector report.
Lab 7.1: reproduce CVE-2025-9556 Gonja SSTI; compare patterns.
Lab 8.1: hypothetical coordinated-disclosure timeline walk.
Lab 10 (capstone): reproduction harness + tool + 6-8 page report + 5-min demo.
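
The version-detection step in Lab 6.1 can be sketched with the standard library alone. This is a minimal sketch, not the academy's tool template: the package name `examplepkg` and its affected range are placeholders, the function names are hypothetical, and the version parser handles only plain dotted numeric versions. The real affected and patched versions for CVE-2025-65106 should be taken from the upstream advisory, and a production tool would more likely use a full version parser such as the `packaging` library.

```python
from importlib import metadata

# PLACEHOLDER data: a made-up package and range, NOT the real affected
# versions for CVE-2025-65106. Ranges are (low inclusive, high exclusive).
HYPOTHETICAL_AFFECTED = {"examplepkg": [("1.0.0", "1.5.0")]}


def parse_version(text: str) -> tuple[int, ...]:
    """Parse a plain dotted numeric version such as '1.2.3'.

    Raises ValueError on pre-release or local tags (e.g. '1.0.0rc1').
    """
    return tuple(int(part) for part in text.split("."))


def in_range(version: str, low_inclusive: str, high_exclusive: str) -> bool:
    return (
        parse_version(low_inclusive)
        <= parse_version(version)
        < parse_version(high_exclusive)
    )


def check_installed(package: str, affected_ranges: list[tuple[str, str]]) -> dict:
    """Compare the installed version of a package against affected ranges."""
    try:
        installed = metadata.version(package)
    except metadata.PackageNotFoundError:
        return {"package": package, "installed": None, "verdict": "not-installed"}
    try:
        hit = any(in_range(installed, lo, hi) for lo, hi in affected_ranges)
    except ValueError:
        return {"package": package, "installed": installed,
                "verdict": "unparseable-version"}
    return {"package": package, "installed": installed,
            "verdict": "vulnerable" if hit else "not-affected"}


for name, ranges in HYPOTHETICAL_AFFECTED.items():
    print(check_installed(name, ranges))
```

Returning an explicit verdict for the not-installed and unparseable cases, instead of raising, keeps the detector's output complete even when the target environment is unexpected.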

Assessment

First, your project must work: the CVE-2025-65106 reproduction works, the reproduction tool detects vulnerable versions correctly, the report is submitted, and the demo is recorded. Then we score the submission on three dimensions: reproduction depth (40%) · tool defensibility & documentation (30%) · report & demo quality against coordinated-disclosure practices (30%). B− minimum on Tier 2 for the certificate.

Career Outcomes & Cross-Course Bridges

→ VCA-ADV-101. The classical-era CVE-to-Tool course; pairs with ADV-102 to give graduates both eras of CVE-to-Tool methodology.
→ VCA-AI-201. Production agentic-system pentesting at scale.
→ VCA-AI-301. Adversarial AI capstone.
Industry. LLM-app security researcher; agentic-system pentester; CVE coordinator; LLM-app auditor.

Tool Journal: ADV-102 Originating Entries

LangChain version-pinning workflow: the discipline of CVE-affected-version reproduction
Burp Suite Community (deeper use): HTTP intercept against agentic endpoints
Jinja2 SSTI payload library: canonical SSTI patterns
Flask testbed harness: local app for the Jinja2 anchor lab
CVE-detector tool template: the academy's scaffolding for reproduction tools
Vendor patch-reading workflow: structured patch-diff analysis
Coordinated-disclosure timeline template: standard format
Cross-language SSTI mapping tool: Jinja2 / Gonja / Eta / FreeMarker comparison harness
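
The version-pinning discipline in the first entry can be checked mechanically. The sketch below is an assumption about how such a check might look, not the academy's workflow: the function name is hypothetical, the sample package names and versions are placeholders, and the parser is deliberately simple (it treats anything other than a bare `name==version` line, including extras such as `name[extra]==1.0`, as unpinned).

```python
import re

# Matches "name==version" at the start of a requirement line.
PIN = re.compile(r"^\s*([A-Za-z0-9][A-Za-z0-9._-]*)\s*==\s*([^\s;#]+)")


def unpinned_requirements(requirements_text: str) -> list[str]:
    """Return requirement lines that are not pinned to an exact version."""
    loose = []
    for raw in requirements_text.splitlines():
        line = raw.split("#", 1)[0].strip()   # drop comments
        if not line or line.startswith("-"):
            continue  # skip blanks and pip options such as -r or -e
        if not PIN.match(line):
            loose.append(line)
    return loose


# Placeholder names and versions, not the CVE-affected pins.
sample = """
# lab environment
examplepkg==1.2.3
otherpkg>=2.0
thirdpkg
"""
print(unpinned_requirements(sample))  # ['otherpkg>=2.0', 'thirdpkg']
```

A reproduction is only repeatable if every dependency resolves to the same version on every install, so a lab environment would want this list to come back empty.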

Before You Start

Have you completed ADV-101? (If no → ADV-101 is central prereq; ADV-102 mirrors its structure.)
Have you completed AI-101? (If no → AI-101 is central prereq; ADV-102 assumes OWASP LLM Top 10 fluency.)
Are you comfortable installing pinned dependency versions in Python virtualenvs? (If no → FND-102 + AI-101 review.)
Can you read CVE writeups + vendor patch diffs fluently? (If no → AI-101 review.)
Are you familiar with responsible-disclosure norms? (If no → SEC-101 ethics module + AI-201 Module 8.)

Format Prescriptions

Hour budget: ~22 lec hr + ~40 lab hr + ~53 indep hr (= ~115 hr total).

Live

2 sessions/wk × 90 min over 10 weeks.

Night class

1-2 sessions/wk evenings; ~20 weeks.

Bootcamp

40 hr/wk × ~3 weeks intensive.

Async self-paced

Recorded video; AI-API budget guidance; 1:1 tutoring premium for tool development.

High school / homeschool co-op

Adapted live cadence over a semester; the HS audience benefits from pairing this course with ADV-101.

Interested in VCA-ADV-102?

Email interested@virtuscyberacademy.org.
