Claude vs Gemini Across 4 Security Domains: A Dead Heat — and the Hardening 63% of AI Code Skips
Across four security domains, Gemini 2.5 Flash and Claude Sonnet 4.6 produced nearly identical results: both leaked password hashes by running MongoDB queries without a projection, both skipped audience validation on JWTs, and neither added audit logging or rate limiting. Among 700 AI-generated functions tested against CWE-mapped ESLint plugins, 63% shipped a vulnerability, which suggests that model choice matters far less than the hardening both models omit.
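The two shared omissions can be illustrated with a minimal TypeScript sketch. It is not code from the comparison: the `UserDoc` shape, the `passwordHash` field name, the helper names, and the audience strings are all hypothetical. The sketch assumes the token's signature has already been verified by a JWT library, and that the projection object would be passed to the database driver's query options.

```typescript
// Hypothetical user document; `passwordHash` is an assumed field name.
interface UserDoc {
  _id: string;
  email: string;
  passwordHash: string;
}

// Exclusion projection in the shape MongoDB expects (0 = omit the field).
// With the official Node.js driver it would typically be passed as
// `collection.findOne(filter, { projection: SAFE_USER_PROJECTION })`.
const SAFE_USER_PROJECTION = { passwordHash: 0 } as const;

type PublicUser = Omit<UserDoc, "passwordHash">;

// Defense in depth: copy only allowlisted fields before the document
// leaves the data layer, so a forgotten projection cannot leak the hash.
function toPublicUser(doc: UserDoc): PublicUser {
  return { _id: doc._id, email: doc.email };
}

// Subset of registered JWT claims. Per RFC 7519, `aud` may be a single
// string or an array of strings.
interface JwtClaims {
  sub?: string;
  aud?: string | string[];
}

// Returns true only when the (already signature-verified) claims name
// this service as an intended audience. A missing `aud` is a failure.
function hasExpectedAudience(claims: JwtClaims, expected: string): boolean {
  if (claims.aud === undefined) return false;
  const audiences = Array.isArray(claims.aud) ? claims.aud : [claims.aud];
  return audiences.some((a) => a === expected);
}
```

In practice most JWT libraries accept the expected audience as a verification option, which is preferable to a hand-rolled check; the function above only makes the omitted comparison explicit.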
- Add CWE-mapped ESLint security plugins to your CI pipeline; with 63% of generated functions shipping a vulnerability, even the best current models cannot be trusted to emit production-ready security controls from feature prompts alone.
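One way to wire such a check into CI is to filter ESLint's JSON report for security rules and fail the build on any hit. The sketch below assumes the report comes from `eslint --format json`; the function names are hypothetical, and the `security/` rule prefix is only an example namespace, since the plugins used in the comparison are not named here.

```typescript
// Minimal shape of one entry in ESLint's JSON formatter output.
interface LintMessage {
  ruleId: string | null; // null for parse errors and similar non-rule messages
  severity: number;      // ESLint uses 1 for warnings and 2 for errors
  message: string;
  line: number;
}

interface LintResult {
  filePath: string;
  messages: LintMessage[];
}

interface Finding {
  file: string;
  line: number;
  rule: string;
}

// Collects error-level messages whose rule ID starts with the given
// plugin namespace (an assumed prefix such as "security/").
function securityFindings(results: LintResult[], rulePrefix: string): Finding[] {
  const findings: Finding[] = [];
  for (const result of results) {
    for (const msg of result.messages) {
      if (msg.ruleId !== null && msg.severity === 2 && msg.ruleId.indexOf(rulePrefix) === 0) {
        findings.push({ file: result.filePath, line: msg.line, rule: msg.ruleId });
      }
    }
  }
  return findings;
}

// Exit code for the CI step: non-zero when any security finding exists.
function ciExitCode(results: LintResult[], rulePrefix: string): number {
  return securityFindings(results, rulePrefix).length > 0 ? 1 : 0;
}
```

ESLint already exits non-zero on any error-level message, so a filter like this is mainly useful when a team wants security rules to block merges while other rules stay advisory.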
For platform engineers shipping AI-generated auth or data-access code, this confirms that ordinary feature prompts routinely yield code without non-functional security guardrails. Code review tends to miss the problem because the absent patterns (audience validation, query projection, rate limits) leave nothing on the page to flag unless a linter checks for them explicitly.
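To make one of the missing guardrails concrete, here is a minimal fixed-window rate limiter in TypeScript. It is an illustrative sketch, not code from the comparison: the class name and parameters are hypothetical, state lives in a single process with no eviction, and a production service would more likely use a shared store or an existing middleware.

```typescript
interface WindowState {
  start: number; // timestamp (ms) when the current window opened
  count: number; // requests counted in the current window
}

// Fixed-window limiter keyed by caller (for example, an IP address or account ID).
class FixedWindowLimiter {
  // Prototype-free object so keys like "__proto__" behave as plain keys.
  private windows: Record<string, WindowState | undefined> = Object.create(null);

  constructor(
    private readonly limit: number,    // max requests allowed per window
    private readonly windowMs: number, // window length in milliseconds
    private readonly now: () => number = () => Date.now() // injectable clock for tests
  ) {}

  // Returns true if the request is allowed, false if the caller is over the limit.
  allow(key: string): boolean {
    const t = this.now();
    let w = this.windows[key];
    // Open a fresh window for new callers or once the old window has expired.
    if (w === undefined || t - w.start >= this.windowMs) {
      w = { start: t, count: 0 };
      this.windows[key] = w;
    }
    if (w.count >= this.limit) return false;
    w.count += 1;
    return true;
  }
}
```

A handler would call `allow(callerId)` before doing any work and return HTTP 429 when it yields `false`. Because the check is a single early return, its absence is easy to overlook in review, which is the gap described above.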