Home
Softono
PentestGPT

PentestGPT

Open source MIT Python
13.6K
Stars
2.4K
Forks
56
Issues
325
Watchers
1 week
Last Commit

About PentestGPT

Automated Penetration Testing Agentic Framework Powered by Large Language Models

Platforms

Web Self-hosted

Languages

Python

Links

Contributors Forks Stargazers Issues MIT License Discord


PentestGPT

AI-Powered Autonomous Penetration Testing Agent
Published at USENIX Security 2024

Official Website: pentestgpt.com »

Research Paper · Report Bug · Request Feature

GreyDGL%2FPentestGPT | Trendshift


Demo

Installation

Installation Demo

Watch on YouTube

PentestGPT in Action

PentestGPT Demo

Watch on YouTube


What's New in v1.0 (Agentic Upgrade)

  • Iteration Loop - The agent runs continuously, maintains a context file with progress, and restarts with prior context when hitting limits. Loop terminates on flag capture or max iterations.
  • Autonomous Agent - Agentic pipeline for intelligent, autonomous penetration testing
  • Session Persistence - Save and resume penetration testing sessions

Multi-model support is available today in the interactive modernized legacy mode (pentestgpt-legacy) — OpenAI, Anthropic, Google Gemini, DeepSeek, xAI, Qwen, Moonshot, and local Ollama. See Interactive Multi-LLM Mode.


Features

  • AI-Powered Challenge Solver - Leverages LLM advanced reasoning to perform penetration testing and CTFs
  • Live Walkthrough - Tracks steps in real-time as the agent works through challenges
  • Multi-Category Support - Web, Crypto, Reversing, Forensics, PWN, Privilege Escalation
  • Real-Time Feedback - Watch the AI work with live activity updates
  • Extensible Architecture - Clean, modular design ready for future enhancements

Quick Start

Prerequisites

  • Python 3.12+
  • uv - Python package manager
  • Claude Code CLI (claude) - installed and authenticated. See Claude Code docs

Installation

git clone https://github.com/GreyDGL/PentestGPT.git
cd PentestGPT
make install    # runs uv sync

Commands Reference

Command Description
make install Install dependencies
make test Run all tests
make check Run lint + typecheck
make build Build distributable package

Usage

# Run against a target
pentestgpt --target 10.10.11.234

# With challenge context
pentestgpt --target 10.10.11.50 --instruction "WordPress site, focus on plugin vulnerabilities"

# Limit iterations
pentestgpt --target 10.10.11.234 --max-iterations 5

The agent runs in an iteration loop: it works autonomously, maintains a context file with progress, and restarts with prior context when hitting limits. The loop terminates on flag capture or max iterations (default: 10).


Interactive Multi-LLM Mode (modernized legacy)

The classic, human-in-the-loop PentestGPT from the USENIX 2024 paper is preserved and modernized as pentestgpt-legacy. It runs three cooperating LLM sessions — reasoning / generation / parsing — that maintain a Pentesting Task Tree (PTT) while you drive the session interactively (next, more, todo, discuss). Unlike the autonomous agent (Claude-only), this mode talks natively to many providers via their official SDKs.

Configure providers

Set an API key for any provider you want to use (in your environment or .env — see .env.example). Only the providers you configure are enabled.

OPENAI_API_KEY=...        ANTHROPIC_API_KEY=...     GEMINI_API_KEY=...   # or GOOGLE_API_KEY
DEEPSEEK_API_KEY=...      GROK_API_KEY=...          QWEN_API_KEY=...     KIMI_API_KEY=...

Run

# Auto-pick the best available models for each session
pentestgpt-legacy

# Choose models per session
pentestgpt-legacy --reasoning-model claude-opus-4-8 --parsing-model gemini-3.5-flash

# Local model via Ollama (OpenAI-compatible)
pentestgpt-legacy --reasoning-model ollama:qwen3 --base-url http://localhost:11434/v1

# List every supported model (shows which providers are configured)
pentestgpt-legacy --list-models

# Live round-trip every configured model and print a pass/fail matrix
pentestgpt-legacy --smoke-test

Supported models (web-verified June 2026)

pentestgpt-legacy --list-models always renders the live registry. Re-run --smoke-test after model IDs change. Current snapshot:

Provider Current models Legacy (kept) Env key
OpenAI gpt-5.5, gpt-5.5-pro, gpt-5.4-mini, gpt-5.4-nano, gpt-5.2, gpt-5.3-codex gpt-4o, gpt-4o-mini, o3, o4-mini OPENAI_API_KEY
Anthropic claude-opus-4-8, claude-sonnet-4-6, claude-haiku-4-5-20251001 ANTHROPIC_API_KEY
Google Gemini gemini-3.1-pro, gemini-3.5-flash, gemini-3-pro, gemini-3.1-flash-lite gemini-2.5-pro, gemini-2.5-flash GEMINI_API_KEY / GOOGLE_API_KEY
DeepSeek deepseek-v4-flash, deepseek-v4-pro deepseek-chat, deepseek-reasoner DEEPSEEK_API_KEY
xAI Grok grok-4.3 GROK_API_KEY / XAI_API_KEY
Alibaba Qwen qwen3.7-max, qwen3.5-flash qwen3-max QWEN_API_KEY / DASHSCOPE_API_KEY
Moonshot Kimi kimi-k2.6 KIMI_API_KEY (.cn default; set MOONSHOT_BASE_URL for .ai)
Local (Ollama) ollama:<model> (e.g. ollama:qwen3) none (OLLAMA_BASE_URL)

The registry lives in pentestgpt_legacy/llm/registry.py (the single source of truth). Adding a model is one ModelSpec entry; OpenAI-compatible providers reuse one connector.


Telemetry

PentestGPT collects anonymous usage data to help improve the tool. This data is sent to our Langfuse project and includes:

  • Session metadata (target type, duration, completion status)
  • Tool execution patterns (which tools are used, not the actual commands)
  • Flag detection events (that a flag was found, not the flag content)

No sensitive data is collected - command outputs, credentials, or actual flag values are never transmitted.

Opting Out

# Via command line flag
pentestgpt --target 10.10.11.234 --no-telemetry

# Via environment variable
export LANGFUSE_ENABLED=false

Benchmarks

PentestGPT achieved an 86.5% success rate (90/104 benchmarks) on the XBOW validation suite:

  • Cost: Average $1.11, Median $0.42 per successful benchmark
  • Time: Average 6.1 minutes, Median 3.3 minutes per successful benchmark
  • Success rates by difficulty:
    • Level 1: 91.1%
    • Level 2: 74.5%
    • Level 3: 62.5%

Citation

If you use PentestGPT in your research, please cite our paper:

@inproceedings{299699,
  author = {Gelei Deng and Yi Liu and Víctor Mayoral-Vilches and Peng Liu and Yuekang Li and Yuan Xu and Tianwei Zhang and Yang Liu and Martin Pinzger and Stefan Rass},
  title = {{PentestGPT}: Evaluating and Harnessing Large Language Models for Automated Penetration Testing},
  booktitle = {33rd USENIX Security Symposium (USENIX Security 24)},
  year = {2024},
  isbn = {978-1-939133-44-1},
  address = {Philadelphia, PA},
  pages = {847--864},
  url = {https://www.usenix.org/conference/usenixsecurity24/presentation/deng},
  publisher = {USENIX Association},
  month = aug
}

License

Distributed under the MIT License. See LICENSE.md for more information.

Disclaimer: This tool is for educational purposes and authorized security testing only. The authors do not condone any illegal use. Use at your own risk.


Acknowledgments

(back to top)