No surprise. It’s out there: malicious AI.

Malicious LLMs have been commercially available since 2023, fine-tuned to build the next big thing in cyber warfare.
That AI can be used for bad as well as for good is no surprise, so it makes sense that bad actors use the technology to create cybersecurity exploits.
Now that AI has super-powered cybercriminals, security weaknesses can be identified much faster, and exploits for them created in much greater numbers.
Here are three examples of security flaws that were found and exploited by malicious AI.
1️⃣ Prompt Injection in AI Coding Tools:
Researchers uncovered over 30 flaws in AI coding assistants like GitHub Copilot, Cursor, and Roo Code. Attackers exploited prompt injection to edit workspace files, override settings, and achieve remote code execution. For instance, CVE-2025-64660 allowed malicious actors to manipulate workspace configurations and execute arbitrary code if the AI was set to auto-approve file writes.
2️⃣ Model Poisoning on AI Platforms:
Security researchers found 100 poisoned AI models on Hugging Face, designed to inject malicious code into systems that incorporated them. Similarly, PyTorch Hub hosted seemingly legitimate models that contained hidden malware, highlighting supply chain risks in AI model distribution.
3️⃣ Insecure AI API Exploitation:
Research revealed that 57% of AI-powered APIs were externally accessible, and 89% relied on insecure authentication. Hackers exploited these weaknesses to hijack requests, inject malicious payloads, or overload systems.
The Lotus C2 framework, sold as a cybercrime kit early this year, automated such attacks.
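Defences against the prompt-injection class of attacks in 1️⃣ can start very simply: scan any workspace file before an agent ingests it, and refuse auto-approval when it contains instruction-like payloads. A minimal Python sketch — the pattern list is illustrative and hypothetical, nowhere near exhaustive:

```python
import re

# Hypothetical phrases that often indicate prompt-injection attempts hidden
# in workspace files (READMEs, comments, config). Illustrative only.
INJECTION_PATTERNS = [
    re.compile(r"ignore (all )?(previous|prior) instructions", re.IGNORECASE),
    re.compile(r"you are now", re.IGNORECASE),
    re.compile(r"system prompt", re.IGNORECASE),
    re.compile(r"auto-?approve", re.IGNORECASE),
]

def scan_for_injection(text: str) -> list[str]:
    """Return the patterns that a file's contents trip, if any."""
    return [p.pattern for p in INJECTION_PATTERNS if p.search(text)]

def is_safe_to_ingest(text: str) -> bool:
    """Gate: refuse to feed a file to the coding agent if it trips a pattern."""
    return not scan_for_injection(text)
```

A real defence would go far beyond keyword matching, but even this crude gate blocks the laziest payloads before they reach an auto-approving agent.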
Cybercrime is big business, which is why criminals develop AI models specifically trained to exploit security vulnerabilities.
Lotus C2 is a command-and-control (C2) framework that surfaced in early 2026, marketed as a "professional" security testing platform but widely sold in cybercrime circles. It enables credential theft, data exfiltration, and mass automated attacks. With its polished interface it looks legitimate, while beneath the surface its traffic blends in with normal network activity.
An example of such a purpose-built model is ExploitGen, an exploit-code generator based on CodeBERT. It automates the creation of exploit code for known vulnerabilities, making it easier for attackers to target unpatched systems.
With such AI models, AI coding agents can now automate the entire attack chain, from reconnaissance to exploitation, operating at speeds that outpace traditional defences.
Increasingly fast-paced, AI-powered cyberattacks demand AI-powered countermeasures.
We’re a small software company out of Amsterdam, and we’ve taken to using coding agents a lot. But we’re acutely aware of the dangers, and we’ve already taken measures to mitigate the most serious risks.
Here’s some of what we’re planning to do over the somewhat longer term.
Isolate Model Training and Inference
By running our own bare metal servers (using the excellent Scaleway, solidly based within the EU), we eliminate reliance on third-party APIs that could be compromised or poisoned. This gives us full control over the model’s environment, data, and updates. We’ll use self-hosted solutions to store and version our models.
Secure the Model Supply Chain
We’ll implement automated pipelines to validate models before deployment: tools like MLflow to track model lineage, training data, and performance metrics, and TrojanZoo or custom scripts to detect unusual behaviour or hidden layers.
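One concrete validation step in such a pipeline is verifying model artifacts against a checksum manifest recorded at training time. A minimal sketch, assuming a simple JSON manifest convention of our own (a map of filename to SHA-256, not a standard format):

```python
import hashlib
import json
from pathlib import Path

def sha256_of(path: Path) -> str:
    """Stream a file through SHA-256 so large model weights fit in memory."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()

def verify_model(model_dir: Path, manifest_path: Path) -> list[str]:
    """Compare every artifact against the manifest; return any problems found.

    The manifest is a JSON map {relative_filename: expected_sha256} produced
    when the model was trained (our own convention, not a standard)."""
    manifest = json.loads(manifest_path.read_text())
    problems = []
    for name, expected in manifest.items():
        artifact = model_dir / name
        if not artifact.exists():
            problems.append(f"missing: {name}")
        elif sha256_of(artifact) != expected:
            problems.append(f"hash mismatch: {name}")
    return problems
```

A poisoned or swapped model file fails the hash check and never reaches deployment; signing the manifest itself would close the remaining gap.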
Harden the Infrastructure
With private VLANs or VPNs for internal access, we’ll expose only the necessary APIs through a reverse proxy, with strict rate limiting and authentication.
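The rate-limiting policy itself would live in the reverse proxy, but the underlying mechanism is typically a token bucket: requests spend tokens, tokens refill at a fixed rate, and bursts are capped. A minimal Python sketch, purely to illustrate the idea:

```python
import time

class TokenBucket:
    """Minimal token-bucket rate limiter, the kind of policy we'd enforce at
    the reverse proxy in front of our APIs (in production this would be a
    proxy rule, not application code)."""

    def __init__(self, rate: float, capacity: int):
        self.rate = rate            # tokens added per second
        self.capacity = capacity    # maximum burst size
        self.tokens = float(capacity)
        self.last = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        # Refill proportionally to elapsed time, capped at capacity.
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

A client hammering the API drains the bucket and gets refused until tokens refill, which blunts both overload attacks and brute-force credential guessing.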
Automate Security Updates and Patching
We can use our existing workflow to automate security updates for model dependencies: for example, trigger model retraining or a rollback if a CVE is detected in a dependency. If a model behaves unexpectedly, we can automatically revert to a known-good version. Model updates are validated in our staging environment, and fixes are deployed only after automated tests and human review.
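The rollback logic can be sketched as a small version registry: deployments append versions, staging marks them known-good, and a misbehaving model reverts to the last known-good one. An in-memory illustration (our real registry would be backed by self-hosted storage, and the class here is hypothetical, not an existing library):

```python
from dataclasses import dataclass, field

@dataclass
class ModelRegistry:
    """Tiny sketch of rollback-to-known-good. Versions are appended as they
    are deployed; only versions that passed staging are marked known-good."""
    versions: list[str] = field(default_factory=list)  # oldest -> newest
    known_good: set[str] = field(default_factory=set)

    def deploy(self, version: str) -> None:
        self.versions.append(version)

    def mark_good(self, version: str) -> None:
        self.known_good.add(version)

    def current(self) -> str:
        return self.versions[-1]

    def rollback(self) -> str:
        """Pop versions until we reach one previously marked known-good."""
        while self.versions and self.versions[-1] not in self.known_good:
            self.versions.pop()
        if not self.versions:
            raise RuntimeError("no known-good version to roll back to")
        return self.versions[-1]
```

The key design choice is that "known-good" is set by the staging pipeline, never by the deployment itself, so a compromised or regressed model can't mark itself as a safe rollback target.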
These measures tie in with our policy of using AI coding agents to superpower our developers while keeping costs down, enabling us to work with our clients to make their businesses more productive and competitive.
The internet was once imagined as a bright future but reality has hit from unexpected directions. Same with AI.
Once upon a time, we were all enthused by this new phenomenon, the internet. We launched blogs and did weird experiments in JavaScript and GIF animations.
But by then, the first cyberattack had already happened.
In 1988, before the web had seen the light of day, a graduate student created a program to measure the size of the internet. It unintentionally exploited vulnerabilities in Unix systems, slowing down or crashing thousands of computers: the Morris worm.
And thus the field of cybersecurity was born.
The internet has turned out simultaneously a great good and a great threat. No surprise then that AI is the same.
But with a targeted response we can do something about the threats surrounding us. As in the real world, if we’re aware of the dangers lurking beneath the surface, we can take measures to mitigate them.
The internet is here to stay. AI is here to stay. With all its good, and all its bad. Face up to it, and build defences.
Header photo: Trust "Tru" Katsande on Unsplash.