
Last Week in AI Security — Week of March 30, 2026

Anthropic's Claude Code codebase accidentally exposed on npm in packaging error; China-linked attackers exploit Claude for cyberattacks; Unit 42 fuzzing research reveals LLM guardrail fragility at scale.

  • Anthropic ships entire Claude Code source (500K lines) to npm in misconfigured debug bundle
  • China-linked hackers exploit Claude and DeepSeek in Mexican government attack stealing tax data
  • Unit 42 genetic prompt fuzzing reveals LLMs remain vulnerable despite years of safety work

Last Week in AI Security — Week of March 23, 2026

Zenity demonstrates zero-click prompt injection exploits at RSA 2026; Cisco releases DefenseClaw open-source agent security framework; Meta AI agent autonomously exposes data in severe breach incident.

  • Zenity's 'Your AI Agents Are My Minions' demo shows zero-click prompt injection chains at RSA 2026
  • Cisco releases DefenseClaw, open-source secure agent framework with NVIDIA OpenShell integration
  • Meta confirms internal AI agent autonomously exposed proprietary code during two-hour Sev 1 incident

Last Week in AI Security — Week of March 16, 2026

vLLM remote code execution via video link (CVE-2026-22778); Palo Alto Networks reveals prompt fuzzing weaknesses across models; HiddenLayer reports 1 in 8 companies hit by agentic AI breaches.

  • Critical vLLM RCE vulnerability (CVE-2026-22778) allows remote code execution via malicious video URL
  • Palo Alto Unit 42 prompt fuzzing study shows evasion rates up to 96.65% against existing guardrails
  • HiddenLayer's 2026 Threat Report: 1 in 8 companies report AI breaches linked to agentic systems

Last Week in AI Security — Week of March 9, 2026

OpenAI acquires Promptfoo for AI red-teaming; Pentagon labels Anthropic a supply-chain risk in heated AI ethics dispute; Chrome Gemini panel CVE and EU AI Act enforcement rules published.

  • OpenAI acquires Promptfoo to integrate AI security testing into Frontier platform
  • Pentagon brands Anthropic 'supply chain risk' after refusing autonomous weapons, mass surveillance use
  • Google patches CVE-2026-0628 in Chrome Gemini AI panel; prompt injection attack surface expands

Last Week in AI Security — Week of March 2, 2026

OpenAI launches Codex Security agent; Palo Alto warns AI agents are 2026's top insider threat; vLLM RCE and LangChain serialization vulnerabilities disclosed.

  • OpenAI launches Codex Security agent finding 10,561 high-severity vulnerabilities across 1.2M commits
  • Palo Alto Networks: AI agents represent new insider threat, with 40% enterprise app integration by 2026
  • MITRE ATLAS publishes first 2026 update with Zenity contributions on agentic AI attack techniques

Last Week in AI Security — Week of February 23, 2026

Defense Secretary declares Anthropic a supply chain risk; CrowdStrike report shows AI-enabled breakout time down to 29 minutes; vLLM RCE vulnerability exposed.

  • Pentagon designates Anthropic as supply chain risk, bans military contractors from using Claude
  • CrowdStrike: AI-accelerated breakout time plummets to 29 minutes, down from 48 minutes in 2024
  • Critical vLLM RCE vulnerability (CVE-2026-22778) enables takeover via malicious video links

Last Week in AI Security — Week of February 16, 2026

International AI Safety Report 2026 published; AI-assisted threat actor compromised 600+ FortiGate devices; Google Translate Gemini prompt injection discovered.

  • AI-powered attack compromised 600+ FortiGate firewalls across 55 countries
  • International AI Safety Report 2026 released by 100+ experts from 30+ nations
  • Google Translate Gemini mode exploited via prompt injection vulnerability

Last Week in AI Security — Week of February 9, 2026

International AI Safety Report reveals escalating risks while critical prompt injection vulnerabilities emerge across major AI platforms.

  • International AI Safety Report 2026 documents real-world AI security threats across deepfakes and cyberattacks
  • NIST releases preliminary Cybersecurity Framework Profile for AI with three-tier priority system
  • OpenAI and Microsoft disclose prompt injection vulnerabilities in ChatGPT Atlas and Copilot memory

Last Week in AI Security — Week of February 2, 2026

International AI Safety Report 2026 released by the UK's AISI, highlighting deepfake and cyberattack risks; ChatGPT wrapper app exposed 300M messages through a Firebase misconfiguration.

  • International AI Safety Report 2026 warns of deepfake surge to 20% of fraud attempts
  • Chat & Ask AI exposed 300M messages from 25M users via Firebase misconfiguration
  • NIST releases Cyber AI Profile preliminary draft extending CSF 2.0 for AI systems

Last Week in AI Security — Week of January 26, 2026

NIST releases formal AI Red Teaming guidelines; a critical vulnerability is disclosed in a popular inference framework; new research demonstrates multi-modal jailbreaks.

  • NIST publishes formal AI Red Teaming framework (AI 600-1 companion)
  • CVE-2026-2847: Remote code execution in vLLM serving endpoint
  • University of Toronto paper demonstrates cross-modal injection attacks