LLM-assisted penetration testing and vulnerability research

Summer 2026 Internship — Project Track 7

Prompt to Pentest: LLM-Assisted Security Testing Lab

Use LLMs to help find common vulnerabilities, interpret security tool output, and develop AI-assisted pentesting utilities in authorized lab environments.

Track Focus

AI-assisted pentesting

Reconnaissance, vulnerability scanning, scanner output triage, safe test generation, custom tool building, verification workflows, and defender-focused reporting.

Industry Sponsor

This track is sponsored by an industry partner focused on applied security testing and AI-assisted tooling.

Sponsor

Lib13 Inc.

Lib13 Inc. will provide industry context for using LLMs responsibly in penetration testing workflows, with emphasis on authorized targets, reproducible validation, and practical tooling that supports human security testers.

About This Track

Explore where LLMs help, where they fail, and how to keep AI-assisted security testing grounded in evidence.

Students will use LLMs to help plan scans of approved lab targets for common vulnerabilities, interpret security tool output, generate safe test cases, and develop small pentesting utilities.

The goal is not autonomous hacking. The goal is to understand how LLMs can support a responsible human tester: planning reconnaissance, explaining findings, identifying false positives, generating repeatable validation steps, and turning raw scanner output into useful remediation guidance.

What You’ll Build

A structured, safety-first workflow for LLM-assisted vulnerability discovery and tool development.

Authorized Scanning Workflow

Use LLMs to plan and interpret common security scans.

Run approved tests against intentionally vulnerable apps, lab services, or sponsor-approved targets. Use LLMs to summarize Nmap, Nikto, OWASP ZAP, Burp Suite, nuclei, Semgrep, or dependency scan output, then manually verify findings.
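As a rough illustration of the "summarize, then manually verify" step, the sketch below reduces Nmap XML output (produced with the `-oX` option) to a short list of open services that could be pasted into an LLM prompt. The function name `summarize_nmap_xml`, the `SAMPLE_XML` fragment, the lab address, and the `verified` flag are assumptions made for this example, not part of Nmap or of the track materials.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment shaped like Nmap's XML output for one lab host.
SAMPLE_XML = """<nmaprun>
  <host>
    <address addr="10.10.0.5" addrtype="ipv4"/>
    <ports>
      <port protocol="tcp" portid="22">
        <state state="open"/>
        <service name="ssh" product="OpenSSH"/>
      </port>
      <port protocol="tcp" portid="8080">
        <state state="closed"/>
        <service name="http-proxy"/>
      </port>
    </ports>
  </host>
</nmaprun>"""


def summarize_nmap_xml(xml_text):
    """Return one dict per open port, ready to hand to an LLM or a human."""
    root = ET.fromstring(xml_text)
    rows = []
    for host in root.findall("host"):
        # A host can carry several <address> elements; this takes the first.
        addr_el = host.find("address")
        addr = addr_el.get("addr") if addr_el is not None else "unknown"
        for port in host.findall("ports/port"):
            state_el = port.find("state")
            if state_el is None or state_el.get("state") != "open":
                continue  # keep the summary focused on open ports
            svc = port.find("service")
            rows.append({
                "host": addr,
                "port": int(port.get("portid")),
                "protocol": port.get("protocol"),
                "service": svc.get("name", "") if svc is not None else "",
                "product": svc.get("product", "") if svc is not None else "",
                "version": svc.get("version", "") if svc is not None else "",
                # Nothing counts as a finding until a human confirms it.
                "verified": False,
            })
    return rows


if __name__ == "__main__":
    for row in summarize_nmap_xml(SAMPLE_XML):
        print(row)
```

Keeping `verified` set to False by default is a deliberate choice in this sketch: the LLM's interpretation of the summary is treated as a lead, and only manual confirmation changes the flag.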

AI Pentesting Utilities

Develop small tools that help testers move from output to evidence.

Build utilities for recon checklists, finding triage, payload organization, safe proof-of-concept generation, reproducible validation notes, report drafting, or remediation mapping.
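A finding-triage utility could start as small as the sketch below, which merges findings that several scanners report for the same target and sorts them so issues seen by more than one tool are reviewed first. The input shape (dicts with `tool`, `target`, `title`, and optional `detail` keys), the host name `lab-web-01`, and the `needs-manual-verification` status string are all assumptions for illustration.

```python
from collections import defaultdict


def merge_findings(raw_findings):
    """Group raw scanner findings that describe the same issue on the same target."""
    merged = defaultdict(lambda: {"tools": set(), "details": []})
    for finding in raw_findings:
        # Normalize case and whitespace so near-identical titles collapse together.
        key = (finding["target"].lower(), finding["title"].strip().lower())
        merged[key]["tools"].add(finding["tool"])
        merged[key]["details"].append(finding.get("detail", ""))

    queue = []
    for (target, title), data in merged.items():
        queue.append({
            "target": target,
            "title": title,
            "tools": sorted(data["tools"]),
            "details": data["details"],
            # Agreement between tools raises priority; it does not prove the issue.
            "status": "needs-manual-verification",
        })
    # Findings reported by more tools come first, then a stable alphabetical order.
    queue.sort(key=lambda item: (-len(item["tools"]), item["target"], item["title"]))
    return queue


if __name__ == "__main__":
    sample = [
        {"tool": "nikto", "target": "lab-web-01", "title": "Missing X-Frame-Options header"},
        {"tool": "zap", "target": "LAB-WEB-01", "title": " missing x-frame-options header "},
        {"tool": "nuclei", "target": "lab-web-01", "title": "Exposed admin path"},
    ]
    for item in merge_findings(sample):
        print(item["tools"], item["title"], item["status"])
```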

Tool and Model Comparison

Study emerging LLM-assisted security workflows.

Compare your lab workflow with examples such as Villager, Mythos-style vulnerability discovery reports, PentestGPT, VulnBot, RapidPen, garak, and other AI security testing tools. Focus on what they can verify, what they hallucinate, and where human review remains necessary.

Safety Harness

Define boundaries before testing begins.

Create rules for authorized scope, rate limits, non-destructive testing, secret handling, data retention, prompt logging, model output review, and escalation to mentors when a finding appears real.
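Two of these rules, authorized scope and rate limits, can be enforced in code before any tool runs. The following is a minimal sketch under stated assumptions: the class name `ScopeGuard`, the lab range `10.10.0.0/24`, and the one-second interval are hypothetical, and a real harness would also need to cover hostnames, logging, and mentor escalation.

```python
import ipaddress
import time


class ScopeGuard:
    """Refuse targets outside the approved ranges and throttle requests."""

    def __init__(self, allowed_cidrs, min_interval_s=1.0):
        self.networks = [ipaddress.ip_network(cidr) for cidr in allowed_cidrs]
        self.min_interval_s = min_interval_s
        self._last_allowed = None  # time of the most recent authorized request

    def in_scope(self, target):
        """Return True only for IP addresses inside an approved network."""
        try:
            ip = ipaddress.ip_address(target)
        except ValueError:
            # Hostnames and malformed input are rejected rather than guessed at.
            return False
        return any(ip in net for net in self.networks)

    def authorize(self, target, now=None):
        """Return True if the target is in scope and the rate limit allows it."""
        now = time.monotonic() if now is None else now
        if not self.in_scope(target):
            return False
        if self._last_allowed is not None and now - self._last_allowed < self.min_interval_s:
            return False
        self._last_allowed = now
        return True


if __name__ == "__main__":
    guard = ScopeGuard(["10.10.0.0/24"], min_interval_s=1.0)
    print(guard.authorize("10.10.0.5"))  # in the assumed lab range
    print(guard.authorize("8.8.8.8"))    # outside the assumed lab range
```

The guard fails closed: anything it cannot parse as an address in an approved range is denied, which matches the intent of defining boundaries before testing begins.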

Target Vulnerability Areas

Students will focus on common, teachable vulnerability classes that can be tested safely in labs.

Web application issues such as injection risks, broken access control, exposed admin paths, insecure headers, and authentication weaknesses.
Cloud and service misconfigurations such as public storage, exposed development services, permissive security groups, or missing logging.
Dependency and code findings from package scanners, SAST tools, and LLM-assisted code review.
Common network exposure patterns discovered through approved recon and service enumeration.
LLM-specific testing considerations such as prompt injection, unsafe tool use, exposed model endpoints, and weak guardrails, when relevant to the lab target.

Expected Deliverables

Evidence-backed outputs that show both technical skill and responsible testing judgment.

Testing Plan

Authorized scope, target inventory, safety rules, tool list, and workflow for LLM-assisted testing.

Tool Prototype

A small AI-assisted pentesting utility for triage, validation, reporting, recon, or remediation mapping.

Finding Reports

Verified lab findings with evidence, impact, reproduction steps, false-positive analysis, and remediation guidance.
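The report elements listed above can be captured in a small record that refuses to call a finding complete until each one is present. This is a sketch only: the `Finding` class, its field names, and the example values are assumptions chosen to mirror the list, not a prescribed report format.

```python
from dataclasses import dataclass, field


@dataclass
class Finding:
    title: str
    target: str
    impact: str = ""
    remediation: str = ""
    false_positive_analysis: str = ""
    evidence: list = field(default_factory=list)            # e.g. request/response excerpts
    reproduction_steps: list = field(default_factory=list)  # ordered, repeatable steps
    manually_verified: bool = False                         # set by a human, never by the model

    def missing_items(self):
        """List the report elements that are still empty or unconfirmed."""
        checks = {
            "evidence": bool(self.evidence),
            "impact": bool(self.impact.strip()),
            "reproduction_steps": bool(self.reproduction_steps),
            "false_positive_analysis": bool(self.false_positive_analysis.strip()),
            "remediation": bool(self.remediation.strip()),
            "manually_verified": self.manually_verified,
        }
        return [name for name, ok in checks.items() if not ok]

    def ready_to_report(self):
        return not self.missing_items()


if __name__ == "__main__":
    draft = Finding(title="Exposed admin path", target="lab-web-01")
    print(draft.missing_items())
```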

Evaluation Notes

Comparison of LLM-assisted workflows, including where models helped, failed, hallucinated, or required human review.

Safety Guide

Practical guardrails for using LLMs in pentesting without crossing authorization, privacy, or operational boundaries.
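One concrete guardrail is to redact obvious secrets from tool output before it is placed in a prompt or a prompt log. The sketch below is illustrative and deliberately incomplete: the three patterns (an AWS-style access key ID, an HTTP bearer token, and `password=`-style parameters) and the function name `redact` are assumptions, and pattern matching alone will miss secrets in other formats.

```python
import re

# Each entry pairs a pattern with its replacement. The list is a starting point,
# not a complete inventory of secret formats.
REDACTION_PATTERNS = [
    # Access key IDs in the common "AKIA" + 16 uppercase alphanumerics shape.
    (re.compile(r"AKIA[0-9A-Z]{16}"), "[REDACTED_AWS_KEY_ID]"),
    # Bearer tokens in an Authorization header; the header name is kept.
    (re.compile(r"(?i)(authorization:\s*bearer\s+)\S+"), r"\1[REDACTED_TOKEN]"),
    # key=value secrets in URLs, form bodies, or config dumps.
    (re.compile(r"(?i)\b(password|passwd|secret|api_key)=([^\s&]+)"), r"\1=[REDACTED]"),
]


def redact(text):
    """Apply every redaction pattern in order and return the cleaned text."""
    for pattern, replacement in REDACTION_PATTERNS:
        text = pattern.sub(replacement, text)
    return text


if __name__ == "__main__":
    raw = "POST /login user=admin&password=hunter2\nAuthorization: Bearer abc.def.ghi"
    print(redact(raw))
```

Running redaction before logging, rather than after, means the unredacted text never reaches the model provider or the retained prompt history.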