Track Focus
AI agents for malware analysis
Reverse engineering, MCP server design, benchmarking, and real-world binary analysis using AI agent harnesses.
This track is sponsored by an industry expert bringing real-world malware analysis experience.
Industry Sponsor
Caleb Fenton
Co-Founder & CEO, Delphos Labs
Caleb brings deep expertise in malware analysis, reverse engineering, and security tooling. He is the driving force behind this track’s focus on AI-assisted binary analysis and MCP tool design.
Explore how AI agents and MCP tooling can improve the speed, accuracy, and cost of real-world malware analysis.
Existing Ghidra MCP integrations are useful but limited. This project will explore how to make them more effective for real-world reverse engineering workflows. Interns may either build a new MCP server from scratch or improve existing open-source MCP tools by consolidating tool interfaces, improving tool descriptions, reducing token usage, and optimizing the agent’s ability to analyze binaries efficiently.
Interns will work with a range of binary samples, including simple learning exercises, harder malware samples, supply chain attack examples such as the xz-utils backdoor, and advanced benchmark sets such as BinaryAudit. The goal is to understand how different tool designs, model choices, and agent harnesses affect analysis quality, speed, cost, and reliability.
Hands-on engineering across agent tooling, binary analysis, and benchmarking.
Agent Harnesses
Experiment with adding MCP servers to agent harnesses such as Claude Code, Codex, OpenCode, or a custom terminal-based interface.
Evaluate how tool surface area, tool descriptions, and server implementation choices affect performance. Compare many narrow tools versus fewer general tools, test local models against hosted models, or rewrite a Python MCP server in Rust for speed and reliability.
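One way to make the narrow-versus-general comparison concrete is to measure how much context each tool surface consumes before the agent does any work. The sketch below is illustrative only: the tool names (`list_functions`, `decompile_function`, `list_strings`, `get_xrefs`, `query_binary`) are hypothetical and not taken from any existing Ghidra MCP server, and the 4-characters-per-token estimate is a rough assumption, not a real tokenizer. The `name`, `description`, and `inputSchema` fields follow the shape of an MCP tool definition.

```python
import json


def estimate_tokens(text: str) -> int:
    """Rough estimate assuming about 4 characters per token (an assumption)."""
    return (len(text) + 3) // 4


def tool_def(name, description, params):
    """Build a minimal MCP-style tool definition with string parameters."""
    return {
        "name": name,
        "description": description,
        "inputSchema": {
            "type": "object",
            "properties": {p: {"type": "string"} for p in params},
            "required": list(params),
        },
    }


def surface_footprint(tools):
    """Return (tool_count, estimated_tokens) for a list of tool definitions."""
    payload = json.dumps(tools, separators=(",", ":"))
    return len(tools), estimate_tokens(payload)


# Hypothetical "many narrow tools" surface.
NARROW = [
    tool_def("list_functions", "List functions in the loaded binary.", ["program"]),
    tool_def("decompile_function", "Decompile one function by address.",
             ["program", "address"]),
    tool_def("list_strings", "List defined strings in the binary.", ["program"]),
    tool_def("get_xrefs", "List cross-references to an address.",
             ["program", "address"]),
]

# Hypothetical "fewer general tools" surface covering the same operations.
GENERAL = [
    tool_def(
        "query_binary",
        "Query the loaded binary. kind is one of: functions, decompile, "
        "strings, xrefs. target is an address when the kind needs one.",
        ["program", "kind", "target"],
    ),
]

if __name__ == "__main__":
    for label, tools in (("narrow", NARROW), ("general", GENERAL)):
        count, tokens = surface_footprint(tools)
        print(f"{label}: {count} tools, about {tokens} tokens of definitions")
```

A smaller definition footprint is only one side of the tradeoff. A general tool moves complexity into its arguments, so tool-selection accuracy and argument errors still need to be measured on real tasks.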
Test-and-Measure Engineering
A major part of this project is rigorous benchmarking and evaluation.
Help define benchmark samples, run repeatable evaluations, compare models and tool configurations, and document what works best. Explore integrations with tools such as Ghidra, Mandiant Speakeasy, or open-source sandboxing systems.
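A repeatable evaluation usually reduces to logging one record per run and summarizing by configuration. The following is a minimal sketch under stated assumptions: the `RunRecord` fields, the configuration labels, and the choice of metrics (accuracy, mean wall-clock time, cost per correct answer) are hypothetical and not a format prescribed by this track. The numbers in the demo are placeholders, not measurements.

```python
from collections import defaultdict
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class RunRecord:
    config: str      # hypothetical label, e.g. "model-a/narrow-tools"
    sample: str      # benchmark sample identifier
    correct: bool    # did the agent reach the expected finding?
    seconds: float   # wall-clock time for the run
    cost_usd: float  # API or compute cost attributed to the run


def summarize(records):
    """Group run records by configuration and compute simple metrics."""
    by_config = defaultdict(list)
    for record in records:
        by_config[record.config].append(record)

    summary = {}
    for config, runs in sorted(by_config.items()):
        n_correct = sum(r.correct for r in runs)
        total_cost = sum(r.cost_usd for r in runs)
        summary[config] = {
            "runs": len(runs),
            "accuracy": n_correct / len(runs),
            "mean_seconds": mean(r.seconds for r in runs),
            # Undefined when nothing was solved, so report None instead.
            "cost_per_correct": total_cost / n_correct if n_correct else None,
        }
    return summary


if __name__ == "__main__":
    # Placeholder values for illustration only, not real results.
    demo = [
        RunRecord("model-a/narrow-tools", "sample-1", True, 10.0, 0.5),
        RunRecord("model-a/narrow-tools", "sample-2", False, 30.0, 0.5),
        RunRecord("model-b/general-tool", "sample-1", False, 5.0, 0.1),
    ]
    for config, metrics in summarize(demo).items():
        print(config, metrics)
```

Reporting cost per correct answer rather than raw cost keeps a cheap but unreliable configuration from looking better than it is.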
Concrete, shareable artifacts that demonstrate your work and contributions.
Gain hands-on experience across malware analysis, AI tooling, and engineering measurement.
Reverse Engineering
Malware analysis fundamentals and Ghidra automation for binary-analysis workflows.
MCP & AI Agents
How MCP servers work, how agents use tools, and how to design effective tool descriptions.
Benchmarking
Evaluation design, measurement methodology, and performance and cost tradeoffs for AI agents.
Tool Design
Optimizing prompts and tool descriptions, reducing token usage, and improving tool-selection accuracy.
Secure Practices
Safe handling of malware samples in controlled environments and responsible research methods.
View the full internship overview for program details, dates, and application information.