What this site is for
AI Attacks covers offensive AI security from a working practitioner's perspective.
AI Attacks exists to cover offensive AI security with the same rigor a working AI red teamer would expect — and the same honesty about what does and doesn’t land in production.
What we publish:
- Technical writeups of working attacks. Prompt injection variants, jailbreak techniques and the model behaviors they exploit, indirect injection through retrieved content, multi-modal attack chains, agent and tool-use abuse. Where possible, reproducible PoCs against open models; closed models get attack patterns and behavioral analysis.
- Adversarial ML, applied. Membership inference, model extraction, evasion attacks, training-data extraction, and backdoors, focused on what's exploitable in deployed systems, not theoretical bounds.
- Red team methodology. Scoping AI engagements, building attack libraries, and communicating findings to a model team that doesn't speak security and a security team that doesn't speak ML.
- Tooling reviews. Honest takes on the offensive AI security tooling landscape (Garak, PyRIT, promptmap, the LLM-specific scanners) and what each is actually good for.
What we don’t publish:
- Press release rewrites
- Listicles
- Anything we can’t source to primary material
Bylines are pseudonymous. The work is the point. Tips, attack reports, and corrections go to the editor.
Regular coverage starts shortly.
Subscribe
Practitioner-grade AI red team techniques and tooling, delivered when there's something worth your inbox.
No spam. Unsubscribe anytime.