Your Developers Are Using AI to Write Code. Here Is the Security Cost
A developer on your team is under deadline pressure. Rather than write a new database query from scratch, they use an AI coding assistant to generate it. The code looks correct. It passes the automated linter. It ships to production on a Friday afternoon.
Three months later, your organization discovers that the query contained a SQL injection vulnerability. An attacker had already found it. Customer data was accessed. Incident response begins. Under the EU’s NIS2 Directive, your security team now has 24 hours to file an early warning with your national authority, and 72 hours to submit an incident notification.
This is not a hypothetical edge case. It is the scenario that cybersecurity researchers and regulators are increasingly treating as a near-certainty as AI coding tools become standard practice across software development teams.
This article explains what the research shows about AI-generated code security, why the problem is structural rather than fixable by simply choosing a better AI tool, what it means for your obligations under EU regulation and what your organization needs to do differently.
AI-Generated Code Security: What the Research Actually Shows
Independent research testing more than 100 large language models across 80 coding tasks in four programming languages (Java, Python, JavaScript and C#) produced a striking headline finding: 45% of AI-generated code contains security flaws.
What makes this finding significant is not just the number but the consistency. Security performance has remained essentially unchanged over time, even as the same AI models dramatically improved at generating syntactically correct, functionally working code. Newer and larger models do not produce significantly more secure code than their predecessors.
The picture becomes sharper when broken down by programming language. Java, the backbone of many enterprise and public-sector applications, produced a security pass rate of only 29%, meaning more than 7 in 10 AI-generated Java code samples contained security flaws. Python reached a 62% pass rate, JavaScript 57% and C# 55%.
Certain vulnerability types were particularly persistent. Cross-Site Scripting (CWE-80) had a failure rate of 86%. Log Injection (CWE-117) failed 88% of the time. Both map to categories in the OWASP Top 10, the most widely used reference list of critical application security risks.
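To make those two weakness classes concrete, here is a minimal Python sketch of what the mitigations usually look like. The function names and sample payloads are illustrative assumptions and do not come from the research itself; real applications would normally rely on their framework's templating and logging safeguards.

```python
import html
import logging


def render_greeting(username: str) -> str:
    """Escape user input before placing it in HTML (mitigates CWE-80)."""
    return f"<p>Hello, {html.escape(username)}!</p>"


def sanitize_for_log(value: str) -> str:
    """Neutralize CR/LF so user input cannot forge extra log lines (mitigates CWE-117)."""
    return value.replace("\r", "\\r").replace("\n", "\\n")


# Hypothetical attacker-controlled inputs.
xss_payload = "<script>alert(1)</script>"
forged_entry = "alice\nINFO admin logged in"

print(render_greeting(xss_payload))

logging.basicConfig(level=logging.INFO)
logging.info("login attempt for user=%s", sanitize_for_log(forged_entry))
```

Code that interpolates `username` directly into the markup, or writes `forged_entry` to the log unchanged, still works functionally, which is why a review for functional correctness alone tends to miss both flaws.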
The broader vulnerability landscape reinforces the urgency. The number of new CVEs (officially registered security vulnerabilities) grew by 336% between 2016 and 2023, then surged a further 39% between 2024 and 2025 alone, a single-year record. AI-assisted development is expected to accelerate this trajectory in the years ahead.

Why AI Code Is Structurally Insecure: Not Just Occasionally Buggy
The instinct is to assume this is a temporary problem: that AI models will get better at writing secure code as they improve. The research suggests otherwise, and understanding why explains the nature of the challenge.
AI coding tools do not understand security. They predict what code is statistically likely to appear next, based on patterns learned from vast repositories of existing code. Those repositories contain both secure and insecure implementations, and the model learns to treat both as acceptable solutions.
Security researchers identify three structural reasons why AI models struggle regardless of their overall capability level.
Training Data Contamination
AI models learn from publicly available code repositories, many of which contain security vulnerabilities. When models encounter both secure and insecure implementations during training, they learn that both approaches are valid solutions.
Lack of Security Context
AI tools generate code without deep understanding of the application’s security requirements, business logic or system architecture. This context gap leads to code that works functionally but lacks appropriate security controls.
Limited Semantic Understanding
Determining whether variables contain user-controlled data requires sophisticated interprocedural analysis, that is, tracking data as it flows across function boundaries. Current AI models cannot perform the complex dataflow analysis needed to make accurate security decisions.
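The following Python sketch illustrates why this is hard. The helper and table names are hypothetical; the point is that the unsafe string formatting sits in one function while the user-controlled value enters in another, so neither function looks obviously wrong when read on its own.

```python
import sqlite3


def build_filter(column: str, value: str) -> str:
    # Looks harmless in isolation: nothing here says `value` is user-controlled.
    return f"{column} = '{value}'"


def find_users_unsafe(conn: sqlite3.Connection, name: str) -> list:
    # The tainted value arrives from the caller and flows through build_filter().
    query = "SELECT id, name FROM users WHERE " + build_filter("name", name)
    return conn.execute(query).fetchall()


def find_users_safe(conn: sqlite3.Connection, name: str) -> list:
    # Parameterized query: the driver treats `name` strictly as data.
    return conn.execute(
        "SELECT id, name FROM users WHERE name = ?", (name,)
    ).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO users (name) VALUES (?)", [("alice",), ("bob",)])

malicious = "x' OR '1'='1"
print(find_users_unsafe(conn, malicious))  # every row is returned
print(find_users_safe(conn, malicious))    # no rows are returned
```

Both versions return the expected row for an ordinary name such as `alice`, so functional tests pass either way. Only the parameterized version stays correct once the input is hostile.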
“AI coding tools could prioritise speed over security. New categories of AI and LLM vulnerabilities demand a wider range of testing skills.”
– Offensive security research, 2026
Development Pipeline: Traditional vs AI-Assisted
Where the security gaps typically appear in an AI-assisted development workflow.
| Stage | Traditional Pipeline | AI-Assisted Pipeline |
| 01. Code writing | Developer writes code manually | AI generates code from a prompt in seconds |
| 02. Code review | Code reviewed by a peer developer | Developer reviews for functional correctness only ⚠ |
| 03. Static analysis | Static analysis scan runs | Same scan, not calibrated for AI output patterns ⚠ |
| 04. Security review | Security team reviews high-risk changes | Security review skipped, volume too high, speed too fast ⚠ |
| 05. Pen testing | Penetration test before major release | Pentest cadence unchanged, annual or per major release ⚠ |
| 06. Deployment | Production deployment | Production deployment, faster with more unreviewed code |
If Your Code Contains AI-Generated Vulnerabilities, You May Already Have a Compliance Problem
For many organizations, this is no longer purely a technical risk. It is a regulatory one. Two pieces of EU legislation create direct obligations that AI-generated code vulnerabilities can breach.
The Cyber Resilience Act (CRA)
The CRA entered into force on 10 December 2024 (Regulation (EU) 2024/2847, published in the Official Journal on 20 November 2024; entered into force 20 days later per Article 71). Its main obligations apply from 11 December 2027, with product vulnerability reporting requirements beginning on 11 September 2026.
The CRA applies to manufacturers, importers and distributors of products with digital elements (software, hardware and connected services) sold or made available in the EU market.
The regulation prescribes that products must arrive on the market free from known exploitable vulnerabilities. It explicitly requires coordinated vulnerability disclosure as a compliance mechanism, meaning organizations must have active processes for finding and remediating vulnerabilities before products reach customers.
An organization shipping software that contains AI-generated code with known vulnerability patterns (SQL injection, cross-site scripting, insecure cryptographic implementations) without adequate security testing before release is directly exposed under this requirement. Fines under the CRA reach €15 million or 2.5% of global annual turnover.
The NIS2 Directive
NIS2 is now in force across the EU. It applies to providers of essential and important services across sectors including energy, transport, banking, health, digital infrastructure and public administration.
The directive demands a proactive, risk-based approach to understanding attack surfaces, managing supply chain risks and vulnerability management. Security testing is specifically endorsed in NIS2 guidelines as producing strong results for most organizations. Non-compliance penalties for essential entities reach €10 million or 2% of annual worldwide turnover, with potential personal liability for board-level management.
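As a rough worked example of what those ceilings mean in practice, the sketch below computes the maximum fine for a given turnover. It assumes the "whichever is higher" reading of both caps and uses made-up turnover figures; the function name is illustrative, and actual penalties depend on the specific infringement and the national authority.

```python
def fine_ceiling(fixed_cap_eur: float, turnover_pct: float, annual_turnover_eur: float) -> float:
    """Upper bound of a fine: the fixed cap or the turnover share, whichever is higher (assumed reading)."""
    return max(fixed_cap_eur, annual_turnover_eur * turnover_pct / 100)


# Hypothetical companies with EUR 2 billion and EUR 100 million worldwide turnover.
large_turnover = 2_000_000_000
small_turnover = 100_000_000

print(fine_ceiling(15_000_000, 2.5, large_turnover))  # CRA ceiling, large company
print(fine_ceiling(10_000_000, 2.0, large_turnover))  # NIS2 ceiling, large company
print(fine_ceiling(15_000_000, 2.5, small_turnover))  # CRA ceiling, small company
```

For the larger company the turnover percentage dominates (€50 million under the CRA, €40 million under NIS2). For the smaller one the fixed €15 million cap is the binding figure.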
The Connection Most Organizations Are Missing
Both regulations were designed for a threat landscape where organizations know what is in their systems and can demonstrate active vulnerability management. AI-assisted development, which ships code faster and in larger volumes with a 45% flaw rate, creates a compliance risk that is invisible until it becomes an incident. The time to close that gap is before the breach, not after the regulatory notification deadline.
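The notification windows from the opening scenario are short enough that it helps to compute them the moment an incident is confirmed. A minimal sketch, assuming the clock starts when the organization becomes aware of the incident and using an invented timestamp:

```python
from datetime import datetime, timedelta, timezone


def nis2_deadlines(became_aware_at: datetime) -> dict:
    """Return the 24-hour and 72-hour NIS2 reporting deadlines for an incident."""
    return {
        "early_warning": became_aware_at + timedelta(hours=24),
        "incident_notification": became_aware_at + timedelta(hours=72),
    }


# Hypothetical discovery on a Friday afternoon (UTC).
aware = datetime(2026, 3, 6, 16, 30, tzinfo=timezone.utc)

for name, due in nis2_deadlines(aware).items():
    print(name, due.isoformat())
```

An incident confirmed on a Friday afternoon puts the early warning deadline on Saturday and the second deadline on Monday, which is one reason the reporting process needs to be rehearsed before it is needed.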
The Vibe Coding Problem: When Non-Developers Write Production Code
The security risk from AI-generated code is not limited to professional developers. A growing phenomenon, referred to in application security research as “vibe coding”, describes non-technical employees using AI tools to build internal automations, data pipelines, scripts and web-facing tools without any developer involvement at all.
These users build functional applications through natural language prompts, bypassing traditional development processes entirely. An operations manager building an internal reporting tool with an AI assistant, a marketing team generating API integrations, an administrator scripting a data migration: all of this creates code that may interact with sensitive data, customer records or regulated systems, and none of it goes through a development pipeline with any security review.
Security research shows that AI lowers the barrier to entry across skill levels, with 24% of professional security researchers reporting that AI helps them work in technologies previously outside their expertise. The same lowered barrier that benefits skilled practitioners also enables non-security people to generate code that interacts with systems they do not fully understand.
For security and compliance teams, this creates an almost invisible attack surface: code that was never reviewed, never tested and, in many cases, never formally recognized as code that needed to be governed.
What Your Organization Should Do Differently
The answer is not to stop using AI coding tools. The answer is to close the security gaps they introduce through a structured approach. The framework below reflects established security operations best practice (Map, Test, Fix, Comply and Govern) applied directly to AI-generated code.
| Step | Action | Why it matters | Who owns it | Source |
| MAP | Inventory all AI-generated code across your organization | You cannot test what you cannot see. Include internal tools and automations built outside the IT department. | IT / Security Lead | Security industry best practice¹² |
| TEST | Move to continuous security testing, not annual pentests | Scanners and point-in-time pentests “fall well short of the testing depth or coverage required today.” | Security Team / MSSP | Offensive security research² |
| FIX | Prioritize fixes by business risk, not just severity score | Patching strategies “must fundamentally change” to account for real business context, not just technical severity scores. | Security + Dev Teams | Gartner 2026 Planning Guide³ |
| COMPLY | Document AI coding tool use in your compliance program | NIS2 and CRA require demonstrable vulnerability management processes. Absence of documentation is itself a compliance gap. | Compliance / CISO | CRA Art. 13 · NIS2 Art. 21 |
| GOVERN | Establish a policy for non-developer AI code generation | Systems handling personal data or customer records require security review regardless of how the code was produced. | IT Policy / Legal | Application security research¹ |
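The FIX step of the framework (prioritizing by business risk instead of by severity score alone) can be sketched in a few lines of Python. The weighting scheme, the 1 to 3 sensitivity scale and the sample findings are all assumptions made for illustration, not a published scoring model.

```python
from dataclasses import dataclass


@dataclass
class Finding:
    title: str
    cvss: float             # technical severity, 0.0 to 10.0
    data_sensitivity: int   # hypothetical scale: 1 = test data, 3 = customer records
    internet_facing: bool


def business_risk(finding: Finding) -> float:
    """Illustrative score: severity weighted by data sensitivity and exposure."""
    exposure = 1.5 if finding.internet_facing else 1.0
    return finding.cvss * finding.data_sensitivity * exposure


findings = [
    Finding("Weak hash in internal test fixture", 9.1, 1, False),
    Finding("SQL injection in customer portal", 7.5, 3, True),
    Finding("Log injection in admin tool", 5.3, 2, False),
]

by_severity = sorted(findings, key=lambda f: f.cvss, reverse=True)
by_risk = sorted(findings, key=business_risk, reverse=True)

print([f.title for f in by_severity])
print([f.title for f in by_risk])
```

Sorted by severity alone, the test fixture issue comes first. Once data sensitivity and exposure are factored in, the customer portal injection moves to the top and the fixture issue drops to last, which is the reordering the FIX step is asking for.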
Why Automated Scanning Alone Is Not the Answer
A common response to the AI code security problem is to add more automated scanning. This is necessary but not sufficient, and security research supports this clearly.
Automated scanners check for patterns they know about. They are fast and scalable and they catch a meaningful proportion of known vulnerability types. But they have structural limitations that are especially relevant in the context of AI-generated code.
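A toy example shows both the value and the limits of pattern-based checks. The sketch below uses Python's standard `ast` module to flag `.execute(...)` calls whose first argument is an f-string. The rule and the two sample snippets are illustrative assumptions; production scanners are far more sophisticated, but the same principle applies to whatever patterns they do not model.

```python
import ast


def flags_fstring_sql(source: str) -> list[int]:
    """Return line numbers of .execute(...) calls whose first argument is an f-string."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Attribute)
            and node.func.attr == "execute"
            and node.args
            and isinstance(node.args[0], ast.JoinedStr)
        ):
            hits.append(node.lineno)
    return hits


# The pattern the rule was written for.
DIRECT = '''
def lookup(conn, name):
    return conn.execute(f"SELECT * FROM users WHERE name = '{name}'")
'''

# The same flaw, with the query built in a helper function.
INDIRECT = '''
def build(name):
    return "SELECT * FROM users WHERE name = '" + name + "'"

def lookup(conn, name):
    return conn.execute(build(name))
'''

print(flags_fstring_sql(DIRECT))    # the call on line 3 is flagged
print(flags_fstring_sql(INDIRECT))  # nothing is flagged
```

The second snippet is just as injectable as the first, yet the rule reports nothing because the pattern it knows about never appears at the call site.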
First, automated tools cannot assess business context. The same vulnerability has very different risk profiles depending on what data it exposes and what controls exist around it. A scanner flags it; a security professional assesses whether it is exploitable and what the real business impact would be.
Second, automated tools miss complex, chained vulnerabilities. Individual components may each pass scanning. The combination of two medium-severity issues (one in AI-generated authentication logic, one in AI-generated session handling) may create a critical exploit that no individual scan detects.
Third, AI security tooling itself introduces noise. Research across a community of 245 active security researchers found that 50% cited false positives as their top concern with AI-assisted security tools and 48% were concerned about hallucinated vulnerability findings. Automation that generates noise has a direct cost: it consumes analyst time that should be spent on genuine risks.
“It is important to keep the human brain involved in security analysis to ensure the impact reflects the context: both the analyst’s knowledge and the organization’s specific environment.”
– Security research on human-in-the-loop security operations, 2026
The principle is straightforward: automated tools handle volume and speed. Human-led security testing handles context, creativity and the kind of adversarial thinking that finds vulnerabilities automated tools were not designed to look for.
The organizations that will navigate this period with the lowest risk exposure are those that treat human security testing not as a periodic compliance exercise but as a continuous operational function adapted to the speed at which AI is changing their attack surface.
Conclusion
AI-generated code security is not a future problem. It is a present one. With 45% of AI-generated code containing security flaws and CVEs growing 39% in a single year, the gap between how fast code is being shipped and how thoroughly it is being tested is widening in real time.
For organizations operating under NIS2 or preparing for CRA obligations, that gap is no longer just a technical risk. It is a compliance and legal liability.
The answer is not to stop using AI coding tools. It is to match the speed of AI-assisted development with continuous, human-led security testing, and to govern the code your entire organization is generating, not just the code your developers write.
AI moves fast. Your security posture needs to keep pace.
References
- YesWeHack. YesWeHack Report 2026: The Trends, Insights and Strategic Shifts Shaping Offensive Security. YesWeHack, 2026. Includes CVE growth data (2016–2025) and a survey of 245 security researchers on AI tool adoption.
- Regulation (EU) 2024/2847 (Cyber Resilience Act), Official Journal of the European Union, 20 November 2024. Entered into force 10 December 2024. Reporting obligations from 11 September 2026. Full application from 11 December 2027.
- Directive (EU) 2022/2555 (NIS2 Directive), Article 21: risk management measures.