
AI and Pentesting
Pulse Report 2026

Five years of real-world pentesting data reveal a widening gap between AI risk and AI security practice. Here’s exactly what separates the organizations closing the gap from those falling behind.

AI and LLM applications produce high-risk findings at 2.7x the rate of conventional software. That ratio has held for two consecutive years. For every serious AI vulnerability that gets fixed, two more remain open and exploitable. Most organizations are testing. Far fewer are built to fix what they find.

32%
of AI pentest findings are high risk. Across every other asset class, it's 12%.

2 out of 3
high-risk AI vulnerabilities go unresolved. The lowest remediation rate of any pentest type.

42 points
separate security leaders (57%) from practitioners (15%) on whether SLAs are actually being met.

78%
of security teams have experienced false negatives from automated tools. Full automation as a preferred approach dropped 20 points in a single year.

The organizations remediating critical AI findings 4.5x faster than their peers don't have bigger budgets. They have better structure.

Cobalt’s 2026 AI and Pentesting Pulse Report, built from five years of pentesting data and a survey of 455 security leaders and practitioners, documents six things those organizations do differently.

A look inside the report

Treat AI and LLM pentesting as a distinct discipline

AI applications don't just inherit conventional software vulnerabilities. They layer new ones on top: prompt injection, insecure output handling, excessive agency. Folding AI testing into existing web or API programs leaves that AI-specific attack surface untested. High-performing organizations build a dedicated practice with industry-standard AI security frameworks as the baseline, not a retrofitted web app checklist.

Move from reactive to programmatic testing

Organizations with continuous, structured offensive security programs are 4.5x more likely to resolve critical findings within three-day SLAs. Laggards leave comparable findings open for an average of 249 days. That gap is determined by structure, not spending.

Close the leader-practitioner divide

57% of security leaders believe their organization consistently meets remediation SLAs. Among the practitioners doing the work, 15% agree. That 42-point gap is a governance failure. High-performing programs establish shared, real-time SLA visibility so leaders and practitioners are managing the same reality.

The full report covers:

  • Why the market has turned decisively against full automation and what the hybrid testing model actually looks like in practice.
  • Why shadow AI is already the leading cause of AI security incidents and what to do about it.
  • Why 60% of security teams say they need better LLM testing capability but only 42% are planning to increase the human-led practice best positioned to deliver it, and how to close that gap before it becomes an incident.