LLM Security Leaderboard
Which AI model is the safest? We tested 15+ LLMs with 230+ attack techniques across 15 categories. See security scores, grades, and vulnerability breakdowns.
Our Testing Methodology
How we evaluate the security of every model, consistently and transparently
AI-to-AI Testing
Our AI agents conduct multi-turn adversarial conversations with each model, simulating real attack scenarios.
230+ Attack Techniques
Every model faces the same comprehensive suite of jailbreaks, prompt injections, evasion, and exfiltration attacks.
Consistent Grading
Security scores (0-100) and letter grades (A+ to F) are calculated using our standardized rubric across all models.
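For illustration, a score-to-grade mapping of this kind can be sketched as below. The thresholds shown are assumptions chosen for the example, not the leaderboard's actual cutoffs.

```python
# Illustrative sketch: map a 0-100 security score to a letter grade.
# The grade bands below are assumed values, not the real rubric.

GRADE_BANDS = [
    (97, "A+"), (93, "A"), (90, "A-"),
    (87, "B+"), (83, "B"), (80, "B-"),
    (77, "C+"), (73, "C"), (70, "C-"),
    (67, "D+"), (63, "D"), (60, "D-"),
]

def letter_grade(score: float) -> str:
    """Return the letter grade for a 0-100 security score."""
    for cutoff, grade in GRADE_BANDS:
        if score >= cutoff:
            return grade
    return "F"
```

Because the same function is applied to every model, two models with the same score always receive the same grade.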
Multilingual Attacks
We test in 12 languages to ensure safety guardrails work across language boundaries and mixed-script attacks.
False Positive Filtering
An LLM judge verifies every finding, eliminating false positives so only real vulnerabilities count toward the score.
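The filtering step can be sketched as a simple pipeline: each raw finding is passed to a judge, and only confirmed findings survive. Here `judge_confirms` is a hypothetical stand-in for the real LLM judge call.

```python
# Illustrative sketch of false-positive filtering: keep only findings
# the judge verifies as real vulnerabilities. `judge_confirms` is a
# placeholder for an actual LLM judge invocation.

from typing import Callable

def filter_findings(findings: list[dict],
                    judge_confirms: Callable[[dict], bool]) -> list[dict]:
    """Return the subset of findings confirmed by the judge."""
    return [f for f in findings if judge_confirms(f)]
```

Only the findings that pass this verification step contribute to a model's security score.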
Regular Re-Testing
Models are re-scanned as providers update them, ensuring the leaderboard always reflects current security posture.
Test Your AI Model
Run the same 230+ attack techniques against your own LLM deployment. Get a security score and detailed vulnerability report in minutes.
Start Free Scan