Advancing Cybersecurity: Navigating the New Landscape of AI Penetration Testing

Chenny Ren
4 min read · Mar 20, 2024

Software security has made significant strides in the past few decades. It’s hard to believe now, but there was a time when penetration testing was conducted only at the host/network layer, and security teams were largely unaware of application-level attacks such as SQL injection and cross-site scripting. As a result, attackers began to bypass traditional defensive measures like network firewalls through application-layer attacks, leading to the emergence of controls such as Web Application Firewalls (WAFs), application security (AppSec) reviews, and source code review.

Today, software security has evolved to include technologies like serverless, API controls, and pipeline security through the DevSecOps movement.

Emerging Threats

As security teams know, hackers are notoriously clever. Once a layer of security is implemented in an application, they move on to target emerging technologies that are not as mature. Keep in mind that today’s AI-based systems are what web applications once were: an attack surface that network security teams did not yet recognize.

For companies implementing AI-based systems, network security teams usually perform penetration testing up to the application layer but overlook a critical point:

The key feature of AI-based systems is their ability to make decisions based on the data provided. If this data and decision-making ability are compromised or stolen, the entire AI ecosystem could be at risk.

Unfortunately, most network security teams today are not aware of the new types of attacks introduced by AI systems. In these attacks, attackers manipulate unique features of how AI systems work to achieve their malicious intent.

Many production models have already been manipulated or deceived, and as AI adoption grows, these attacks will only increase.

Unique Types of AI Attacks

The new threat landscape introduced by AI is very diverse. Data may be intentionally or unintentionally “poisoned,” leading to manipulated decisions. Similarly, AI logic or data can be “inferred,” leading to data extraction. Once attackers figure out the underlying decision logic, models can be “evaded.”
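To make the “evasion” idea concrete, here is a minimal sketch in plain NumPy (not a real attack tool): a toy logistic-regression classifier is trained on synthetic data, and a correctly classified input is nudged against the model’s gradient, the core idea behind the Fast Gradient Sign Method, until the model’s decision flips. All names and data here are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

# Two well-separated clusters: class 0 around (-1, -1), class 1 around (1, 1).
X = rng.normal(loc=[[-1.0, -1.0]] * 100 + [[1.0, 1.0]] * 100, scale=0.5)
y = np.array([0] * 100 + [1] * 100)

# Train a tiny logistic regression with plain gradient descent.
w = np.zeros(2)
b = 0.0
for _ in range(500):
    p = 1 / (1 + np.exp(-(X @ w + b)))      # predicted probabilities
    w -= 0.1 * (X.T @ (p - y)) / len(y)
    b -= 0.1 * np.mean(p - y)

def predict(x):
    """Return the model's hard decision (0 or 1) for a single input."""
    return int((x @ w + b) > 0)

# Evasion: for a linear model, the loss gradient w.r.t. the input is
# proportional to w, so stepping against sign(w) pushes a class-1 input
# toward the class-0 side of the boundary (FGSM-style perturbation).
x = np.array([1.0, 1.0])                    # confidently classified as 1
epsilon = 1.5
x_adv = x - epsilon * np.sign(w)

print("original:", predict(x), "adversarial:", predict(x_adv))
```

A perturbation the model’s developers would consider tiny is often enough in practice; the large epsilon here is just to make the toy example deterministic.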

Penetration Testing AI Applications

The good news is you don’t have to start from scratch. If you’ve been part of a cybersecurity team involved in penetration testing or red teaming, you might be familiar with the MITRE ATT&CK framework, a publicly accessible database of adversary tactics and techniques based on real-world observations.

This popular framework has been used to create the MITRE ATLAS (Adversarial Threat Landscape for Artificial-Intelligence Systems), a knowledge base of adversarial tactics, techniques, and case studies for machine learning (ML) systems based on real-world observations, ML red teaming, security team demonstrations, and academic research.

ATLAS follows the same structure as ATT&CK, making it easy for cybersecurity practitioners to study and adopt its techniques when testing the vulnerabilities and security risks of their internal AI systems. It also helps raise awareness of these risks within the cybersecurity community, as they are presented in a format practitioners are already familiar with.

AI Penetration Testing Tools

Standard security tools usually cannot assess a model’s exposure to inference or evasion attacks. Fortunately, cybersecurity teams can supplement their existing penetration testing toolkit with free tools. These tools are open source, but commercial alternatives are also available.

Ensure that these tools have the following capabilities:

  • Model-agnostic: able to test all types of models, not limited to any specific one.
  • Technology-agnostic: capable of testing AI models hosted on any platform, whether cloud-based or on-premises.
  • Integratable with your existing toolkit: should have command-line capabilities so your security team can easily script and automate tests.

Some free tools you can find include Microsoft’s Counterfit and the Adversarial Robustness Toolbox (ART).

To make these tools effective, ensure they are mapped to the ATLAS framework to keep them consistent with general standards. Use these tools for red teaming/penetration testing and vulnerability assessment of AI systems. Regularly scan your AI assets with them and build a risk tracker specific to AI risks. Tracking these risks over time allows you to see improvements in security posture and monitor progress.
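An AI-specific risk tracker can start as something very simple: each finding records the affected asset, the ATLAS technique it maps to, and a status, so posture can be compared scan over scan. The sketch below is a minimal illustration, not an official schema; the field names are assumptions, and the technique labels are examples of the kind of ATLAS entries you would map findings to.

```python
import csv
import datetime
import io

# Illustrative tracker schema; adapt the fields to your own program.
FIELDS = ["date", "asset", "atlas_technique", "severity", "status"]

def add_finding(tracker, asset, technique, severity, status="open"):
    """Append one AI-risk finding, stamped with today's date."""
    tracker.append({
        "date": datetime.date.today().isoformat(),
        "asset": asset,
        "atlas_technique": technique,
        "severity": severity,
        "status": status,
    })

def to_csv(tracker):
    """Render the tracker as CSV for reporting or diffing between scans."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(tracker)
    return buf.getvalue()

tracker = []
add_finding(tracker, "fraud-model-v3", "Evade ML Model", "high")
add_finding(tracker, "chatbot-api", "Exfiltration via ML Inference API", "medium")
print(to_csv(tracker))
```

Re-running your scans and diffing successive CSV snapshots gives exactly the over-time view of improvements the paragraph above describes.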

Another valuable resource for gaining better attack context and awareness is the ATLAS case studies page. It lists known attacks against production AI systems, allowing security teams to better understand the potential impact on their own systems.

I hope this provides a starting point for incorporating AI penetration testing into your cybersecurity efforts. Rest assured, with the significant increase in AI adoption and cybercriminals’ interest in exploiting AI, this field will rapidly become commonplace in the coming years.



Chenny Ren

OSCP | OSWP | OSEP | CRTP | CRTE | CRTO | Red Team Professional | SOC engineer