Last Week in AI Security
Curated summaries of the most important AI security developments, delivered every Friday.
Last Week in AI Security — Week of March 30, 2026
Anthropic's Claude Code codebase accidentally exposed on npm in packaging error; China-linked attackers exploit Claude for cyberattacks; Unit 42 fuzzing research reveals LLM guardrail fragility at scale.
- Anthropic ships entire Claude Code source (500K lines) to npm in misconfigured debug bundle
- China-linked hackers exploit Claude and DeepSeek in attack on Mexican government that stole tax data
- Unit 42 genetic prompt fuzzing reveals LLMs remain vulnerable despite years of safety work
Last Week in AI Security — Week of March 23, 2026
Zenity demonstrates zero-click prompt injection exploits at RSA 2026; Cisco releases DefenseClaw open-source agent security framework; Meta AI agent autonomously exposes data in severe breach incident.
- Zenity's 'Your AI Agents Are My Minions' demo shows zero-click prompt injection chains at RSA 2026
- Cisco releases DefenseClaw, open-source secure agent framework with NVIDIA OpenShell integration
- Meta confirms internal AI agent autonomously exposed proprietary code during two-hour Sev 1 incident
Last Week in AI Security — Week of March 16, 2026
Critical vLLM remote code execution via malicious video link (CVE-2026-22778); Palo Alto Networks reveals prompt fuzzing weaknesses across models; HiddenLayer reports 1 in 8 companies hit by agentic AI breaches.
- Critical vLLM RCE vulnerability (CVE-2026-22778) allows remote code execution via malicious video URL
- Palo Alto Unit 42 prompt fuzzing study shows evasion rates up to 96.65% against existing guardrails
- HiddenLayer's 2026 Threat Report: 1 in 8 companies report AI breaches linked to agentic systems
Last Week in AI Security — Week of March 9, 2026
OpenAI acquires Promptfoo for AI red-teaming; Pentagon labels Anthropic a supply-chain risk in heated AI ethics dispute; Chrome Gemini panel CVE and EU AI Act enforcement rules published.
- OpenAI acquires Promptfoo to integrate AI security testing into Frontier platform
- Pentagon brands Anthropic a 'supply chain risk' after it refuses autonomous weapons and mass surveillance uses
- Google patches CVE-2026-0628 in Chrome Gemini AI panel; prompt injection attack surface expands
Last Week in AI Security — Week of March 2, 2026
OpenAI launches Codex Security agent; Palo Alto warns AI agents are 2026's top insider threat; MITRE ATLAS publishes its first 2026 update on agentic AI attack techniques.
- OpenAI launches Codex Security agent finding 10,561 high-severity vulnerabilities across 1.2M commits
- Palo Alto Networks: AI agents represent a new insider threat, with integration into 40% of enterprise apps projected by 2026
- MITRE ATLAS publishes first 2026 update with Zenity contributions on agentic AI attack techniques
Last Week in AI Security — Week of February 23, 2026
Defense Secretary declares Anthropic a supply chain risk; CrowdStrike report shows AI-enabled breakout time down to 29 minutes; vLLM RCE vulnerability exposed.
- Pentagon designates Anthropic as supply chain risk, bans military contractors from using Claude
- CrowdStrike: AI-accelerated breakout time plummets to 29 minutes, down from 48 minutes in 2024
- Critical vLLM RCE vulnerability (CVE-2026-22778) enables takeover via malicious video links
Last Week in AI Security — Week of February 16, 2026
International AI Safety Report 2026 published; AI-assisted threat actor compromised 600+ FortiGate devices; Google Translate Gemini prompt injection discovered.
- AI-powered attack compromised 600+ FortiGate firewalls across 55 countries
- International AI Safety Report 2026 released by 100+ experts from 30+ nations
- Google Translate Gemini mode exploited via prompt injection vulnerability
Last Week in AI Security — Week of February 9, 2026
International AI Safety Report reveals escalating risks; critical prompt injection vulnerabilities emerge across major AI platforms.
- International AI Safety Report 2026 documents real-world AI security threats across deepfakes and cyberattacks
- NIST releases preliminary Cybersecurity Framework Profile for AI with three-tier priority system
- OpenAI and Microsoft disclose prompt injection vulnerabilities in ChatGPT Atlas and Copilot memory
Last Week in AI Security — Week of February 2, 2026
International AI Safety Report 2026, released by the UK's AISI, highlights deepfake and cyberattack risks; ChatGPT wrapper app exposes 300M messages through Firebase misconfiguration.
- International AI Safety Report 2026 warns of deepfake surge to 20% of fraud attempts
- Chat & Ask AI exposed 300M messages from 25M users via Firebase misconfiguration
- NIST releases Cyber AI Profile preliminary draft extending CSF 2.0 for AI systems
Last Week in AI Security — Week of January 26, 2026
NIST releases AI Red Teaming guidelines; critical vulnerability disclosed in a popular inference framework; new research on multi-modal jailbreaks.
- NIST publishes formal AI Red Teaming framework (AI 600-1 companion)
- CVE-2026-2847: Remote code execution in vLLM serving endpoint
- University of Toronto paper demonstrates cross-modal injection attacks