VCA-ADV-102: Adversarial Techniques II: LLM-CVE Reproduction
ADV-102 is the LLM-era variant of ADV-101's CVE-to-Tool pedagogy. ADV-101 took a memorable classical CVE, had students reproduce it end-to-end, and shipped a Burp-Suite-style reproduction tool plus a coordinated-disclosure-style writeup. ADV-102 keeps that structure and changes the target: CVE-2025-65106 (LangChain Jinja2 SSTI), the academy's canonical LLM-era CVE-to-Tool target. Students reproduce the CVE on a controlled local install, build a defensible reproduction tool that detects vulnerable versions, and ship a coordinated-disclosure-style report.
Course Overview
ADV-102 is the academy's LLM-era CVE-to-Tool course. The pedagogical structure mirrors ADV-101 exactly: a single named CVE; a full reproduction; a defensible tool that detects the vulnerability; a coordinated-disclosure-style report. The LLM-era variant uses CVE-2025-65106 (LangChain Jinja2 SSTI) as the target because the bug is recent enough to be relevant, well-documented enough to reproduce cleanly, generalisable enough to teach the bug class, and patched enough that the academy can offer the lab without responsible-disclosure tension.
The chapter's thesis: LLM-era CVEs land in libraries that production agentic systems depend on, not in models themselves. Students who finish ADV-102 understand that the attack surface of an agentic system is not the LLM itself; it is the prompt-rendering pipeline, the template engines, the deserialisation layers, and the tool-calling boundaries that surround the LLM. This is where the next 10 years of practical LLM security work will land.
Position relative to peer offerings. ADV-102 is the academy's first formal curriculum that pairs ADV-101's CVE-to-Tool methodology with an LLM-era CVE. Industry training (HackTheBox, OffSec) does not yet offer a comparable LLM-CVE-reproduction track. The academy's coordinated-disclosure discipline (taught explicitly in the chapter) is the part industry tooling tends to skip.
Pedagogy. ADV-102 inherits ADV-101's tone and structure. A comparative thread carries the cross-language SSTI pattern through the course: Jinja2 in Python, Gonja in Go (CVE-2025-9556 in AI-201's curriculum), Eta in JavaScript, FreeMarker in Java. The chapter shows that the bug class generalises and that defending the Python flavour leaves the others exposed unless the same defence is applied to each engine.
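One way to see why a Python-only defence leaves the other engines exposed: each engine has its own expression delimiters, so a filter written against Jinja2 syntax never fires on FreeMarker or Eta input. The sketch below is a teaching heuristic, not a complete defence. The table `ENGINE_DELIMITERS` and the helper `flag_template_syntax` are names invented for this illustration, and the table covers only what are assumed to be each engine's default delimiters.

```python
import re

# Default expression/statement openers per engine (assumed defaults; the
# engines allow some customisation, which this table ignores).
ENGINE_DELIMITERS = {
    "Jinja2 (Python)": [r"\{\{", r"\{%"],
    "Gonja (Go)": [r"\{\{", r"\{%"],  # Jinja-style syntax
    "Eta (JavaScript)": [r"<%"],
    "FreeMarker (Java)": [r"\$\{", r"<#"],
}


def flag_template_syntax(untrusted: str) -> list[str]:
    """Return the engines whose default delimiters occur in `untrusted`."""
    hits = []
    for engine, patterns in ENGINE_DELIMITERS.items():
        if any(re.search(pattern, untrusted) for pattern in patterns):
            hits.append(engine)
    return hits


if __name__ == "__main__":
    for sample in ["hello world", "{{ 7*7 }}", "${7*7}", "<%= 7*7 %>"]:
        print(repr(sample), "->", flag_template_syntax(sample))
```

A Jinja2-only check would pass the `${7*7}` and `<%= 7*7 %>` samples untouched, which is the gap the cross-language thread is meant to expose.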
How the Course Teaches: Foundational Readings
ADV-102 carries the same paired-textbook system as ADV-101, now applied to the LLM-era CVE variant. The WAHH web-application mental model, established in PEN-101 and deepened in ADV-101, is here applied specifically to template injection in agentic stacks. The OWASP LLM Top 10 and OWASP ASI (Agentic Security Initiative) Top 10 provide the broader taxonomy that makes CVE-2025-65106 intelligible as a representative instance of a class, not a one-off quirk.
Stuttard and Pinto's treatment of server-side template injection predates LangChain by a decade, but the mechanism it describes is structurally identical to CVE-2025-65106. Their account of how a templating engine that evaluates user-controlled input as code, rather than treating it as data, produces a code-execution surface explains why the Jinja2 SSTI in LangChain's prompt-template layer was discoverable in the first place. Module 3's Flask-based anchor lab reproduces a generic Jinja2 SSTI first; that ground-level reproduction is Stuttard and Pinto's chapter applied to a modern Python web stack. Students who have read Stuttard and Pinto's SSTI section arrive at Module 4's LangChain-specific reproduction understanding what they are looking for before the first payload fires.
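The code-versus-data distinction can be shown without any template engine installed. The sketch below uses Python's built-in `str.format` as a stand-in for Jinja2 so that it runs on the standard library alone; it is an analogue of the bug class, not the CVE-2025-65106 payload or LangChain's code. `AppConfig`, `render_as_data`, and `render_as_template` are hypothetical names for this illustration, and the probe performs only an attribute lookup.

```python
class AppConfig:
    """Stand-in for server-side state a template should never expose."""

    def __init__(self) -> None:
        self.api_key = "not-a-real-secret"


def render_as_data(user_input: str, cfg: AppConfig) -> str:
    # Safe pattern: the template is fixed by the developer and the
    # user-controlled string is only ever substituted as a value.
    return "You said: {msg}".format(msg=user_input, cfg=cfg)


def render_as_template(user_input: str, cfg: AppConfig) -> str:
    # Vulnerable pattern: the user-controlled string becomes part of the
    # template, so any placeholder it contains is evaluated server-side.
    return ("You said: " + user_input).format(cfg=cfg)


if __name__ == "__main__":
    probe = "{cfg.api_key}"  # benign probe: attribute lookup only
    cfg = AppConfig()
    print(render_as_data(probe, cfg))      # placeholder stays literal text
    print(render_as_template(probe, cfg))  # placeholder is evaluated
```

The same input is inert in the first function and live in the second; the only difference is which string the engine is asked to interpret.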
CVE-2025-65106 is a concrete instance of OWASP LLM Top 10 item LLM01 (Prompt Injection) in the specific sense that the injection reaches a template renderer rather than the model itself. The OWASP ASI Top 10 extends this taxonomy to agentic systems: where a single LLM call has one injection surface, an agentic workflow with tool-calling and multi-step reasoning has many. Module 2's LangChain architecture trace establishes which components carry which OWASP LLM/ASI classification; the capstone report's coordinated-disclosure section explicitly maps the CVE to the relevant OWASP taxonomy items. SEC-101 alumni will recognize the OWASP framing from the AI-strand forward pointer in that course; ADV-102 is where that forward pointer lands.
Seitz and Arnold's chapter on Volatility and memory forensics frames a question that ADV-102's Module 6 tool-building work takes seriously: once a reproduction tool fires a payload, what forensic evidence does the attacker leave behind, and what evidence does the defender collect? The module's reproduction tool outputs a structured detector report; that report's forensic-trace section is where Seitz and Arnold's forensic-scripting discipline applies to an LLM-era context. The cross-language SSTI mapping tool in Module 7 uses the same instrumentation philosophy: emit enough structured output that an independent analyst can reconstruct what the tool did without running it again.
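As a rough illustration of what "enough structured output" might look like, the sketch below records each tool action as a timestamped event and serialises the whole report to JSON. The field names, the `DetectorReport` and `TraceEvent` classes, and the example target are assumptions made for this sketch; the academy's actual report schema is not specified on this page.

```python
import json
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone


@dataclass
class TraceEvent:
    """One action the tool took, recorded for later reconstruction."""
    timestamp: str
    action: str
    detail: str


@dataclass
class DetectorReport:
    target: str
    cve_id: str
    verdict: str = "unknown"
    forensic_trace: list[TraceEvent] = field(default_factory=list)

    def record(self, action: str, detail: str) -> None:
        # UTC timestamps keep traces comparable across analysts' machines.
        now = datetime.now(timezone.utc).isoformat()
        self.forensic_trace.append(TraceEvent(now, action, detail))

    def to_json(self) -> str:
        return json.dumps(asdict(self), indent=2)


if __name__ == "__main__":
    report = DetectorReport(target="localhost:8000", cve_id="CVE-2025-65106")
    report.record("version_check", "read installed package metadata")
    report.record("probe_sent", "benign probe, no side effects")
    report.verdict = "vulnerable-version-detected"
    print(report.to_json())
```

Because every step is appended in order with a timestamp, a reviewer can replay the tool's decisions from the JSON alone.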
Curriculum Outline
Ten modules across ~10 weeks.
| Module | Topic | Project |
|---|---|---|
| 1 | The CVE-to-Tool methodology, recapped from ADV-101 | 2-page mapping table comparing ADV-101 target to ADV-102 target |
| 2 | LangChain architecture & the templating pipeline | Trace a prompt through LangChain Expression Language; identify the templating step |
| 3 | Jinja2 SSTI, the bug class | Reproduce a generic Jinja2 SSTI in a Flask app to anchor the bug class |
| 4 | CVE-2025-65106, the specific instance | Pin vulnerable LangChain version; reproduce the chain |
| 5 | The patch & the defender lens | Read the upstream patch; identify the missing input validation |
| 6 | Building the reproduction tool (CVE detector) | Build a Python tool that scans a target for vulnerable versions; outputs detector report |
| 7 | Cross-language generalisation | Reproduce the Go cousin (CVE-2025-9556); pair with the Python target |
| 8 | Coordinated-disclosure discipline | Walk a hypothetical disclosure timeline; produce a vendor-readable report |
| 9 | Defensible reproduction-tool deployment | Package the tool; document; publish to a private repo for instructor review |
| 10 | Capstone: full CVE reproduction + tool + report | Submit reproduction harness + tool + 6-8 page coordinated-disclosure-style report + 5-min recorded demo |
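
The core of a Module 6 style detector is a version-range check. The sketch below shows one possible shape using only the standard library. The package name and the affected ranges are placeholders, not the real CVE-2025-65106 ranges, which should be copied from the published advisory. The simple parser is also an assumption: it handles plain `X.Y.Z` strings only, where a production tool would more likely rely on a dedicated version-parsing library.

```python
from importlib import metadata


def parse_version(text: str) -> tuple[int, ...]:
    """Parse a plain 'X.Y.Z' release string; pre-release tags are not handled."""
    return tuple(int(part) for part in text.split("."))


def is_vulnerable(installed: str, ranges: list[tuple[str, str]]) -> bool:
    """True if `installed` falls in any half-open range [low, high)."""
    version = parse_version(installed)
    return any(
        parse_version(low) <= version < parse_version(high)
        for low, high in ranges
    )


def check_package(name: str, ranges: list[tuple[str, str]]) -> str:
    """Look up the installed version of `name` and classify it."""
    try:
        installed = metadata.version(name)
    except metadata.PackageNotFoundError:
        return f"{name}: not installed"
    try:
        vulnerable = is_vulnerable(installed, ranges)
    except ValueError:
        return f"{name} {installed}: unparsed version, check manually"
    return f"{name} {installed}: {'VULNERABLE' if vulnerable else 'ok'}"


if __name__ == "__main__":
    # Placeholder ranges for a hypothetical package; a real detector would
    # take the affected ranges from the vendor advisory.
    placeholder_ranges = [("0.0.0", "0.3.5"), ("1.0.0", "1.0.2")]
    print(check_package("example-package", placeholder_ranges))
```

Half-open ranges mirror the usual advisory wording ("affected before X"), so the first patched version falls outside the range by construction.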
Learning Outcomes
- Remember. State the CVE identifier, CVSS score, affected versions, and patched version of CVE-2025-65106.
- Understand. Explain why Jinja2 SSTI is the canonical agentic-system bug class for Python-based stacks.
- Apply. Reproduce CVE-2025-65106 end-to-end on a controlled local install.
- Apply. Build a defensible reproduction tool that detects vulnerable versions of LangChain.
- Apply. Reproduce the Go cousin (CVE-2025-9556) and identify the cross-language pattern.
- Analyze. Read the upstream patch and identify the missing input validation.
- Synthesize. Ship the capstone: reproduction harness + tool + 6-8 page coordinated-disclosure report.
Hands-On Labs
- Lab 3.1: generic Jinja2 SSTI in a Flask app (anchor the bug class).
- Lab 4.1 (signature): CVE-2025-65106 LangChain Jinja2 SSTI end-to-end.
- Lab 5.1: read the upstream patch; defender lens.
- Lab 6.1: build the reproduction tool; outputs detector report.
- Lab 7.1: reproduce CVE-2025-9556 Gonja SSTI; compare patterns.
- Lab 8.1: hypothetical coordinated-disclosure timeline walk.
- Lab 10 (capstone): reproduction harness + tool + 6-8 page report + 5-min demo.
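For the Lab 8.1 timeline walk, a hypothetical disclosure schedule can be generated from the report date. The sketch below assumes a 90-day embargo, a common industry convention rather than a rule stated by this course, and the milestone labels and the 7-day acknowledgement window are illustrative choices. `build_timeline` is a name invented here.

```python
from datetime import date, timedelta


def build_timeline(reported: date, embargo_days: int = 90) -> list[tuple[date, str]]:
    """Return (date, milestone) pairs for a hypothetical disclosure schedule."""
    milestones = [
        (0, "report sent to vendor"),
        (7, "vendor acknowledgement expected"),
        (embargo_days // 2, "mid-embargo status check"),
        (embargo_days, "planned public disclosure"),
    ]
    return [(reported + timedelta(days=offset), label) for offset, label in milestones]


if __name__ == "__main__":
    for when, label in build_timeline(date(2025, 1, 1)):
        print(when.isoformat(), label)
```

Keeping the embargo length as a parameter lets students compare how different policies shift the same set of milestones.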
Assessment
First, your project must work: the CVE-2025-65106 reproduction runs, the reproduction tool detects vulnerable versions correctly, the report is submitted, and the demo is recorded. Then we score the submission on three dimensions (40/30/30): reproduction depth (40%) · tool defensibility & documentation (30%) · report & demo quality measured against coordinated-disclosure practice (30%). B− minimum on the scored tier (Tier 2) for the certificate.
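The two-step assessment reduces to a gate followed by a weighted sum. The sketch below assumes each dimension is marked on a 0-100 scale, which this page does not state, and it does not encode the B− threshold because no numeric cutoff is given here. The dictionary keys and the function name are illustrative.

```python
from typing import Optional

# Weights in percent, taken from the 40/30/30 split.
WEIGHTS = {
    "reproduction_depth": 40,
    "tool_defensibility": 30,
    "report_demo": 30,
}


def weighted_score(scores: dict[str, float], gate_passed: bool) -> Optional[float]:
    """Return the weighted score, or None if the working-project gate failed."""
    if not gate_passed:
        return None  # nothing is scored until the project works
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS) / 100


if __name__ == "__main__":
    example = {"reproduction_depth": 90, "tool_defensibility": 80, "report_demo": 70}
    print(weighted_score(example, gate_passed=True))
    print(weighted_score(example, gate_passed=False))
```

With the example marks, the weighted result is (40·90 + 30·80 + 30·70) / 100 = 81.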
Career Outcomes & Cross-Course Bridges
- → VCA-ADV-101. The classical-era CVE-to-Tool course; pairs with ADV-102 to give graduates both eras of CVE-to-Tool methodology.
- → VCA-AI-201. Production agentic-system pentesting at scale.
- → VCA-AI-301. Adversarial AI capstone.
- Industry. LLM-app security researcher; agentic-system pentester; CVE coordinator; LLM-app auditor.
Tool Journal: ADV-102 Originating Entries
- LangChain version-pinning workflow, the discipline of CVE-affected-version reproduction
- Burp Suite Community (deeper use), HTTP intercept against agentic endpoints
- Jinja2 SSTI payload library, canonical SSTI patterns
- Flask testbed harness, local app for Jinja2 anchor lab
- CVE-detector tool template, the academy's scaffolding for reproduction tools
- Vendor patch-reading workflow, structured patch-diff analysis
- Coordinated-disclosure timeline template, standard format
- Cross-language SSTI mapping tool, Jinja2 / Gonja / Eta / FreeMarker comparison harness
Before You Start
- Have you completed ADV-101? (If no → ADV-101 is a core prerequisite; ADV-102 mirrors its structure.)
- Have you completed AI-101? (If no → AI-101 is a core prerequisite; ADV-102 assumes OWASP LLM Top 10 fluency.)
- Are you comfortable installing pinned dependency versions in Python virtualenvs? (If no → FND-102 + AI-101 review.)
- Can you read CVE writeups + vendor patch diffs fluently? (If no → AI-101 review.)
- Are you familiar with responsible-disclosure norms? (If no → SEC-101 ethics module + AI-201 Module 8.)
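Reproducing a CVE against an affected version depends on exact pins, so one quick self-check is whether a requirements file contains only `==` pins. The sketch below is a deliberately strict, assumption-laden checker: `unpinned_lines` and `PIN_PATTERN` are names invented here, and lines with environment markers, hashes, or pip options are flagged as not pinned even though pip accepts them.

```python
import re

# Matches 'name==version' with an optional extras block, e.g. 'pkg[extra]==1.2.3'.
PIN_PATTERN = re.compile(
    r"^[A-Za-z0-9._-]+(\[[A-Za-z0-9._,-]+\])?==[A-Za-z0-9._+!-]+$"
)


def unpinned_lines(requirements_text: str) -> list[str]:
    """Return requirement lines that are not exact '==' pins."""
    problems = []
    for raw in requirements_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if line and not PIN_PATTERN.match(line):
            problems.append(line)
    return problems


if __name__ == "__main__":
    sample = "flask==3.0.0\nrequests>=2.0\n# a comment\nexample-package==0.1.0\n"
    print(unpinned_lines(sample))
```

An empty result means every dependency is pinned to a single version, which is the precondition for a reproducible vulnerable install.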
Recommended Readings
Primary anchor pair: practitioner narrative (LLM-CVE depth)
- Stuttard & Pinto, The Web Application Hacker's Handbook, 2nd ed., Chs 8-9 (Wiley, 2011; ISBN 978-1-118-02647-2). The template-injection and server-side rendering chapters are the direct theoretical substrate for Module 3's Flask anchor lab and Module 4's CVE-2025-65106 reproduction. First introduced in PEN-101; deepened in ADV-101; applied to LLM stacks here.
- Seitz & Arnold, Black Hat Python, 2nd ed., Ch 10 + supporting chapters (No Starch Press, 2021; ISBN 978-1-7185-0112-6). Forensic-scripting and structured-output discipline for the reproduction tool's forensic-trace section and the Module 7 cross-language mapping harness.
OWASP taxonomy references (free)
- OWASP Top 10 for Large Language Model Applications (owasp.org; 2025 edition), LLM01 Prompt Injection through LLM10 Unbounded Consumption (Model Theft held the LLM10 slot in the earlier 2023 list). ADV-102 explicitly maps CVE-2025-65106 to OWASP LLM taxonomy items in the capstone report.
- OWASP Top 10 for Agentic AI Applications (ASI Top 10) (owasp.org; 2025 release). Extends the LLM taxonomy to multi-step, tool-calling agentic systems. Module 2's architecture trace classifies each LangChain component by its ASI risk category.
Supplementary
- Yaworski, Real-World Bug Hunting (No Starch, 2019). Bug-bounty methodology; SSTI chapter maps onto ADV-102's Module 3 anchor lab.
- LangChain official documentation + CVE-2025-65106 upstream patch diff (GitHub; free), the primary source material for Modules 2-5.
Course handout
- LLM and Agentic-System Security Vocabulary Reference (Virtus Academy; free). The course-companion vocab-tier handout for ADV-102. Walks all ten OWASP LLM Top 10 (2025) and all ten OWASP ASI Top 10 entries at student-pinnable depth with CWE pairings, 19 MITRE ATLAS technique IDs, a structural worked example using CVE-2025-65106, and six agentic-system attack primitives. Complements this catalog page; deep CVE and taxonomy content lives in the handout rather than here.
Format Prescriptions
Hour budget: ~22 lec hr + ~40 lab hr + ~53 indep hr (= ~115 hr total).
Live
2 sessions/wk × 90 min over 10 weeks.
Night class
1-2 sessions/wk evenings; ~20 weeks.
Bootcamp
40 hr/wk × ~3 weeks intensive.
Async self-paced
Recorded video; AI-API budget guidance; 1:1 tutoring premium for tool development.
High school / homeschool co-op
Adapted live cadence over a semester; the HS audience benefits from pairing this course with ADV-101.