AI-Generated Code Security: The 2026 Risk Picture

Your Developers Are Using AI to Write Code. Here Is the Security Cost

A developer on your team is under deadline pressure. Rather than write a new database query from scratch, they use an AI coding assistant to generate it. The code looks correct. It passes the automated linter. It ships to production on a Friday afternoon.

Three months later, your organization discovers that query contained a SQL injection vulnerability. An attacker had already found it. Customer data was accessed. Incident response begins. Under the EU’s NIS2 Directive, your security team now has 24 hours to file an early warning with your national authority, and 72 hours to follow up with an incident notification.

This is not a hypothetical edge case. It is the scenario that cybersecurity researchers and regulators are increasingly treating as a near-certainty as AI coding tools become standard practice across software development teams.

This article explains what the research shows about AI-generated code security, why the problem is structural rather than fixable by simply choosing a better AI tool, what it means for your obligations under EU regulation, and what your organization needs to do differently.

AI-Generated Code Security: What the Research Actually Shows

Independent research testing more than 100 large language models across 80 coding tasks in four programming languages (Java, Python, JavaScript and C#) produced a striking headline finding: 45% of AI-generated code contains security flaws.

What makes this finding significant is not just the number; it is the consistency. Security performance has remained essentially unchanged over time, even as the same AI models dramatically improved at generating syntactically correct, functionally working code. Newer and larger models do not produce significantly more secure code than their predecessors.

The picture becomes sharper when broken down by programming language. Java, the backbone of many enterprise and public-sector applications, produced a security pass rate of only 29%, meaning more than 7 in 10 AI-generated Java code samples contained security flaws. Python reached a 62% pass rate, JavaScript 57% and C# 55%.

Certain vulnerability types were particularly persistent. Cross-Site Scripting (CWE-80) had a failure rate of 86%. Log Injection (CWE-117) failed 88% of the time. Both map to categories in the OWASP Top 10, the most widely used reference list of critical application security risks.
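To make the two weakness classes concrete, here is a minimal Python sketch of the usual mitigations: escaping user text before it reaches HTML (CWE-80) and neutralizing line breaks before user text reaches a log (CWE-117). The function names and the log message format are hypothetical, chosen only for illustration; they are not taken from the research cited above.

```python
import html
import logging

logger = logging.getLogger("app")  # hypothetical logger name


def render_comment(comment: str) -> str:
    """Escape user-supplied text before embedding it in HTML (CWE-80)."""
    # html.escape converts <, >, & and quotes into HTML entities,
    # so a submitted <script> tag is displayed as text, not executed.
    return f"<p>{html.escape(comment)}</p>"


def sanitize_for_log(value: str) -> str:
    """Neutralize CR/LF so input cannot forge extra log lines (CWE-117)."""
    return value.replace("\r", "\\r").replace("\n", "\\n")


def log_failed_login(username: str) -> None:
    # The username is attacker-controlled, so it is sanitized first.
    logger.warning("Failed login for user=%s", sanitize_for_log(username))
```

AI-generated code often omits exactly these small steps, because the version without them still works in a functional test.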

The broader vulnerability landscape reinforces the urgency. The number of new CVEs (officially registered security vulnerabilities) grew by 336% between 2016 and 2023, and then surged a further 39% between 2024 and 2025 alone, a single-year record. AI-assisted development is expected to accelerate this trajectory in the years ahead.

Why AI Code Is Structurally Insecure: Not Just Occasionally Buggy

The instinct is to assume this is a temporary problem, that AI models will get better at writing secure code as they improve. The research suggests otherwise, and understanding why explains the nature of the challenge.

AI coding tools do not understand security. They predict what code is statistically likely to appear next, based on patterns learned from vast repositories of existing code. Much of that training data contains vulnerabilities: public code repositories hold both secure and insecure implementations, and the model learns to treat both as acceptable solutions.

Security researchers identify three structural reasons why AI models struggle regardless of their overall capability level.

Training Data Contamination

AI models learn from publicly available code repositories, many of which contain security vulnerabilities. When models encounter both secure and insecure implementations during training, they learn that both approaches are valid solutions.

Lack of Security Context

AI tools generate code without deep understanding of the application’s security requirements, business logic or system architecture. This context gap leads to code that works functionally but lacks appropriate security controls.

Limited Semantic Understanding

Determining whether a variable contains user-controlled data requires sophisticated interprocedural analysis, that is, tracing data across function and module boundaries. Current AI models cannot perform the complex dataflow analysis needed to make accurate security decisions.
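The dataflow point can be illustrated with a small Python sketch using the standard library's sqlite3 module. The table, the helper names and the payload are assumptions made up for this example. The helper `build_filter` looks harmless when read in isolation; the injection only becomes visible when the attacker-controlled value is followed from the caller into the helper.

```python
import sqlite3


def build_filter(column: str, value: str) -> str:
    # Hypothetical helper: nothing here reveals that `value`
    # may be attacker-controlled by the time it arrives.
    return f"{column} = '{value}'"


def find_user_unsafe(conn: sqlite3.Connection, username: str) -> list:
    # Tainted input is concatenated into SQL one function call away.
    query = "SELECT id, name FROM users WHERE " + build_filter("name", username)
    return conn.execute(query).fetchall()


def find_user_safe(conn: sqlite3.Connection, username: str) -> list:
    # Parameterized query: the value travels separately from the SQL text.
    query = "SELECT id, name FROM users WHERE name = ?"
    return conn.execute(query, (username,)).fetchall()


def demo() -> tuple:
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany("INSERT INTO users (name) VALUES (?)",
                     [("alice",), ("bob",)])
    payload = "x' OR '1'='1"  # classic injection string
    return find_user_unsafe(conn, payload), find_user_safe(conn, payload)
```

Calling `demo()` returns every row in the table for the unsafe version and an empty list for the parameterized one, even though both functions pass a quick functional test with ordinary usernames.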

“AI coding tools could prioritise speed over security. New categories of AI and LLM vulnerabilities demand a wider range of testing skills.”

– Offensive security research, 2026

Development Pipeline: Traditional vs AI-Assisted

Where the security gaps typically appear in an AI-assisted development workflow.

Stage	Traditional Pipeline	AI-Assisted Pipeline
01. Code writing	Developer writes code manually	AI generates code from a prompt in seconds
02. Code review	Code reviewed by a peer developer	Developer reviews for functional correctness only ⚠
03. Static analysis	Static analysis scan runs	Same scan, not calibrated for AI output patterns ⚠
04. Security review	Security team reviews high-risk changes	Security review skipped, volume too high, speed too fast ⚠
05. Pen testing	Penetration test before major release	Pentest cadence unchanged, annual or per major release ⚠
06. Deployment	Production deployment	Production deployment, faster with more unreviewed code

If Your Code Contains AI-Generated Vulnerabilities, You May Already Have a Compliance Problem

For many organizations, this is no longer purely a technical risk; it is a regulatory one. Two pieces of EU legislation create direct obligations that AI-generated code vulnerabilities can breach.

The Cyber Resilience Act (CRA)

The CRA (Regulation (EU) 2024/2847) was published in the Official Journal on 20 November 2024 and entered into force 20 days later, on 10 December 2024, per Article 71. Its main obligations apply from 11 December 2027, with product vulnerability reporting requirements beginning on 11 September 2026.

The CRA applies to manufacturers, importers and distributors of products with digital elements (software, hardware and connected services) sold or made available in the EU market.

The regulation prescribes that products must arrive on the market free from known exploitable vulnerabilities. It explicitly requires coordinated vulnerability disclosure as a compliance mechanism, meaning organizations must have active processes for finding and remediating vulnerabilities before products reach customers.

An organization shipping software that contains AI-generated code with known vulnerability patterns (SQL injection, cross-site scripting, insecure cryptographic implementations) without adequate security testing before release is directly exposed under this requirement. Fines under the CRA reach €15 million or 2.5% of global annual turnover.

The NIS2 Directive

NIS2 is now in force across the EU. It applies to providers of essential and important services across sectors including energy, transport, banking, health, digital infrastructure and public administration.

The directive demands a proactive, risk-based approach to understanding attack surfaces, managing supply chain risks and vulnerability management. Security testing is specifically endorsed in NIS2 guidelines as producing strong results for most organizations. Non-compliance penalties reach €10 million or 2% of annual worldwide turnover, with potential personal liability for board-level management.
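For a rough sense of scale, the fine ceilings and the NIS2 reporting windows quoted in this article can be worked through in a few lines of Python. This is a sketch, not legal advice. The fixed amounts, percentages and the 24-hour and 72-hour windows are the figures given in this article; the assumption that the higher of the fixed amount and the turnover share applies is a reading of the legal texts that should be confirmed with counsel. All function and constant names are hypothetical.

```python
from datetime import datetime, timedelta, timezone

# (fixed ceiling in EUR, share of turnover in per mille), as quoted in the article
CRA_CAP = (15_000_000, 25)    # 2.5% of global annual turnover
NIS2_CAP = (10_000_000, 20)   # 2% of annual worldwide turnover


def fine_ceiling(turnover_eur: int, cap: tuple) -> int:
    """Upper bound of a fine, ASSUMING the higher of the two amounts applies."""
    fixed, per_mille = cap
    return max(fixed, turnover_eur * per_mille // 1000)


def nis2_deadlines(aware_at: datetime) -> dict:
    """Reporting deadlines counted from the moment the incident is known."""
    return {
        "early_warning": aware_at + timedelta(hours=24),
        "72_hour_report": aware_at + timedelta(hours=72),
    }


# Example: a company with EUR 2 billion turnover, incident noticed Friday 16:00 UTC
print(fine_ceiling(2_000_000_000, CRA_CAP))
print(nis2_deadlines(datetime(2026, 3, 6, 16, 0, tzinfo=timezone.utc)))
```

For smaller organizations the fixed amount dominates; for large ones the turnover share does, which is why the exposure scales with company size.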

The Connection Most Organizations Are Missing

Both regulations were designed for a threat landscape where organizations know what is in their systems and can demonstrate active vulnerability management. AI-assisted development, which ships code faster and in larger volumes with a 45% flaw rate, creates a compliance risk that is invisible until it becomes an incident. The time to close that gap is before the breach, not after the regulatory notification deadline.

The Vibe Coding Problem: When Non-Developers Write Production Code

The security risk from AI-generated code is not limited to professional developers. A growing phenomenon, referred to in application security research as “vibe coding”, describes non-technical employees building functional applications through natural language prompts: internal automations, data pipelines, scripts and web-facing tools, all without any developer involvement.

An operations manager building an internal reporting tool with an AI assistant, a marketing team generating API integrations, an administrator scripting a data migration: all of this creates code that may interact with sensitive data, customer records or regulated systems, and none of it goes through a development pipeline with any security review.

Security research shows that AI lowers the barrier to entry across skill levels, with 24% of professional security researchers reporting that AI helps them work in technologies previously outside their expertise. The same lowered barrier that benefits skilled practitioners also enables non-security people to generate code that interacts with systems they do not fully understand.

For security and compliance teams, this creates an almost invisible attack surface: code that was never reviewed, never tested and, in many cases, never formally recognized as code that needed to be governed.

What Your Organization Should Do Differently

The answer is not to stop using AI coding tools. The answer is to close the security gaps they introduce through a structured approach. The framework below reflects established security operations best practice (Map, Test, Fix, Comply and Govern) applied directly to AI-generated code.

Step	Action	Why it matters	Who owns it	Source
MAP	Inventory all AI-generated code across your organization	You cannot test what you cannot see. Include internal tools and automations built outside the IT department.	IT / Security Lead	Security industry best practice¹²
TEST	Move to continuous security testing, not annual pentests	Scanners and point-in-time pentests “fall well short of the testing depth or coverage required today.”	Security Team / MSSP	Offensive security research²
FIX	Prioritize fixes by business risk, not just severity score	Patching strategies “must fundamentally change” to account for real business context, not just technical severity scores.	Security + Dev Teams	Gartner 2026 Planning Guide³
COMPLY	Document AI coding tool use in your compliance program	NIS2 and CRA require demonstrable vulnerability management processes. Absence of documentation is itself a compliance gap.	Compliance / CISO	CRA Art. 13 · NIS2 Art. 21
GOVERN	Establish a policy for non-developer AI code generation	Systems handling personal data or customer records require security review regardless of how the code was produced.	IT Policy / Legal	Application security research¹
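The MAP and GOVERN steps of the framework can be pictured as a simple inventory with a review rule on top. The following Python sketch is one possible shape for that record, under stated assumptions: the field names, the `CodeAsset` class and the review rule are hypothetical and would need to be adapted to an organization's own policy.

```python
from dataclasses import dataclass


@dataclass
class CodeAsset:
    """One inventory entry (MAP). All field names are illustrative."""
    name: str
    owner: str
    ai_generated: bool
    built_by_developer: bool
    touches_personal_data: bool
    security_reviewed: bool = False


def needs_security_review(asset: CodeAsset) -> bool:
    # GOVERN: data sensitivity triggers review, regardless of
    # who or what produced the code.
    return asset.touches_personal_data and not asset.security_reviewed


def review_backlog(inventory: list) -> list:
    """Names of assets that still need a security review."""
    return [a.name for a in inventory if needs_security_review(a)]


def unreviewed_ai_assets(inventory: list) -> list:
    """Visibility list: AI-generated code nobody has reviewed yet."""
    return [a.name for a in inventory
            if a.ai_generated and not a.security_reviewed]


inventory = [
    CodeAsset("billing-api", "dev-team", True, True, True, security_reviewed=True),
    CodeAsset("ops-report-tool", "operations", True, False, True),
    CodeAsset("lunch-menu-bot", "office", True, False, False),
]
print(review_backlog(inventory))
print(unreviewed_ai_assets(inventory))
```

The design choice worth noting is that `built_by_developer` plays no part in the review rule: a reporting tool built by an operations manager enters the backlog on the same terms as code from the engineering team.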

Why Automated Scanning Alone Is Not the Answer

A common response to the AI code security problem is to add more automated scanning. This is necessary, but it is not sufficient, and security research supports this clearly.

Automated scanners check for patterns they know about. They are fast and scalable and they catch a meaningful proportion of known vulnerability types. But they have structural limitations that are especially relevant in the context of AI-generated code.

First, automated tools cannot assess business context. The same vulnerability has very different risk profiles depending on what data it exposes and what controls exist around it. A scanner flags it; a security professional assesses whether it is exploitable and what the real business impact would be.

Second, automated tools miss complex, chained vulnerabilities. Individual components may each pass scanning. The combination of two medium-severity issues (one in AI-generated authentication logic, one in AI-generated session handling) may create a critical exploit that a scan of either component alone does not detect.

Third, AI security tooling itself introduces noise. Research across a community of 245 active security researchers found that 50% cited false positives as their top concern with AI-assisted security tools, and 48% were concerned about hallucinated vulnerability findings. Automation that generates noise has a direct cost: it consumes analyst time that should be spent on genuine risks.
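The first limitation, business context, is the easiest to show in code. The sketch below ranks findings by a score that adjusts technical severity for exposure, data sensitivity and existing controls. The multipliers are arbitrary illustrative weights, not values from any standard or from the research cited here, and the `Finding` fields are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Finding:
    title: str
    cvss: float                  # technical severity, 0.0 to 10.0
    internet_facing: bool
    handles_customer_data: bool
    compensating_control: bool   # e.g. the endpoint sits behind a VPN


def business_risk(f: Finding) -> float:
    """Severity adjusted for context. Weights are illustrative assumptions."""
    score = f.cvss
    if f.internet_facing:
        score *= 1.5
    if f.handles_customer_data:
        score *= 1.5
    if f.compensating_control:
        score *= 0.5
    return score


def prioritize(findings: list) -> list:
    """Highest business risk first."""
    return sorted(findings, key=business_risk, reverse=True)


findings = [
    Finding("RCE in internal batch job", 9.0, False, False, True),
    Finding("SQL injection in customer portal", 6.0, True, True, False),
]
for f in prioritize(findings):
    print(f.title, business_risk(f))
```

With these weights, the medium-severity flaw in the customer portal outranks the critical-severity flaw in an isolated internal job. A scanner sorting by severity alone would put them in the opposite order; deciding on the weights is the part that needs human judgment.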

“It is important to keep the human brain involved in security analysis to ensure the impact reflects the context: both the analyst’s knowledge and the organization’s specific environment.”

– Security research on human-in-the-loop security operations, 2026

The principle is straightforward: automated tools handle volume and speed. Human-led security testing handles context, creativity and the kind of adversarial thinking that finds vulnerabilities automated tools were not designed to look for.

The organizations that will navigate this period with the lowest risk exposure are those that treat human security testing not as a periodic compliance exercise but as a continuous operational function, adapted to the speed at which AI is changing their attack surface.

Conclusion

AI-generated code security is not a future problem; it is a present one. With 45% of AI-generated code containing security flaws and CVEs growing 39% in a single year, the gap between how fast code is being shipped and how thoroughly it is being tested is widening in real time.

For organizations operating under NIS2 or preparing for CRA obligations, that gap is no longer just a technical risk. It is a compliance and legal liability.

The answer is not to stop using AI coding tools. It is to match the speed of AI-assisted development with continuous, human-led security testing, and to govern the code your entire organization is generating, not just the code your developers write.

AI moves fast. Your security posture needs to keep pace.

References

Veracode. GenAI Code Security Report: Assessing the Security of Using LLMs for Coding. Veracode.com, 2025. Study of 100+ LLMs across 80 coding tasks in Java, Python, JavaScript, and C#.
