Anthropic publishes safety research and owns that conversation. We do the same for offensive AI security: quarterly papers, real findings, and a fully reproducible methodology, submitted to USENIX, Black Hat, DEF CON, and IEEE S&P.
Multi-Turn Authority Escalation against GPT-5 Enterprise
A formal study of the five-layer authority escalation pattern (curious learner → researcher → official → regulator → emergency) against frontier enterprise models. Includes a breakdown of attack success rate (ASR) per layer and per attack category.
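As a rough illustration of what a per-layer and per-category ASR breakdown involves, here is a minimal Python sketch. It assumes each attempt is logged as a (layer, category, succeeded) tuple; that record shape, the function name asr_breakdown, the category labels, and the sample rows are all hypothetical placeholders, not data or code from the study.

```python
from collections import defaultdict

# Layer names as listed in the escalation pattern above.
LAYERS = ["curious learner", "researcher", "official", "regulator", "emergency"]

def asr_breakdown(attempts):
    """Return attack success rate keyed by ("layer", name) and ("category", name).

    attempts: iterable of (layer, category, succeeded) tuples (assumed log format).
    """
    totals = defaultdict(lambda: [0, 0])  # key -> [successes, attempts]
    for layer, category, succeeded in attempts:
        for key in (("layer", layer), ("category", category)):
            totals[key][0] += int(succeeded)
            totals[key][1] += 1
    return {key: s / n for key, (s, n) in totals.items()}

# Made-up rows purely to exercise the function; not real findings.
sample = [
    ("curious learner", "data exfiltration", False),
    ("researcher", "data exfiltration", False),
    ("regulator", "data exfiltration", True),
    ("emergency", "policy bypass", True),
    ("emergency", "policy bypass", False),
]
rates = asr_breakdown(sample)
for layer in LAYERS:
    if ("layer", layer) in rates:
        print(layer, round(rates[("layer", layer)], 2))
```

Keeping layer and category in one dictionary with tagged keys is only a convenience for the sketch; a real report would more likely use a two-way table so that layer and category can be crossed.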
How we built a living knowledge graph of attack-to-model edges that learns within a single scan and across the customer base. Includes the mid-scan learning loop architecture.
Outline only; full draft TBA
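One way to picture a graph of attack-to-model edges that learns during a scan is sketched below in Python. This is an assumption-laden toy, not the architecture the paper will describe: the AttackGraph class, its method names, the smoothed success-rate score, and the attack and model labels are all hypothetical.

```python
from collections import defaultdict

class AttackGraph:
    """Toy attack-to-model graph; each edge stores [successes, trials]."""

    def __init__(self):
        self.edges = defaultdict(lambda: [0, 0])

    def record(self, attack, model, succeeded):
        # Update the edge as soon as a result arrives, so later choices
        # in the same scan already see it (the "mid-scan" part).
        edge = self.edges[(attack, model)]
        edge[0] += int(succeeded)
        edge[1] += 1

    def score(self, attack, model):
        # Laplace-smoothed success rate; untried edges score 0.5.
        s, n = self.edges.get((attack, model), (0, 0))
        return (s + 1) / (n + 2)

    def next_attack(self, model, candidates):
        # Greedy choice: the candidate with the highest current score.
        return max(candidates, key=lambda a: self.score(a, model))

# Hypothetical labels, for illustration only.
graph = AttackGraph()
graph.record("roleplay", "model-a", True)
graph.record("encoding", "model-a", False)
pick = graph.next_attack("model-a", ["roleplay", "encoding", "authority"])
print(pick)
```

Sharing one graph instance across scans would be the simplest stand-in for learning across a customer base. The greedy selection here never explores low-scoring edges, so a real system would presumably need some exploration strategy.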
Publication cadence
Quarterly
One technical paper submitted to a top-tier venue (USENIX, Black Hat, DEF CON, IEEE S&P), covering a real finding from the corpus.
Monthly
Short technical posts on the blog summarizing notable jailbreaks added to the harvester.
Continuous
The Living Leaderboard at shieldpi.info updates weekly with the delta from the harvester and customer scans.
All findings are reproducible from our public methodology endpoint.