Last Week in AI Security — Week of April 27, 2026

Executive Summary

The week of April 27, 2026, underscores a critical inflection point: AI security threats are no longer theoretical—they are operationalized, measured, and increasing. Google researchers observed a 32% increase in malicious prompt injection attempts between November 2025 and February 2026, confirming that adversaries are actively probing production AI systems at scale. While sophistication remains relatively low and researchers did not observe significant amounts of advanced attacks using known exfiltration prompts published by security researchers in 2025, the trajectory is clear: both scale and sophistication are expected to escalate rapidly.

On the policy and procurement front, the Pentagon formalized its AI strategy by announcing agreements with seven major technology companies—SpaceX, OpenAI, Google, Microsoft, Nvidia, Amazon Web Services, and Reflection—to deploy their AI tools on classified networks. Conspicuously absent from this roster is Anthropic, which the Trump administration has blacklisted over the company’s insistence that the Pentagon include certain safety guardrails for the government’s use of AI in warfare. The stand-off has intensified after Anthropic’s Mythos model demonstrated significant offensive cyber capabilities, creating a paradox where national security imperatives collide with safety frameworks.

Meanwhile, foundational infrastructure vulnerabilities continue to expose the AI stack. CVE-2026-31431, a Linux local privilege escalation flaw with a CVSS score of 7.8, allows an unprivileged local user to write four controlled bytes into the page cache of any readable file and use that to gain root. The vulnerability affects essentially all Linux distributions shipped since August 2017, demonstrating that the operating systems underpinning AI infrastructure harbor decade-old flaws now entering active exploitation. Security practitioners face a dual challenge: defend against novel AI-specific attacks while remediating legacy vulnerabilities in the substrate supporting ML workloads.

Top Stories

Prompt Injection Attacks Increase 32% as Adversaries Scale Operations Against Production AI

Google has analyzed AI indirect prompt injection attempts involving sites on the public web and noticed an increase in malicious attacks over the past months, providing the first large-scale empirical measurement of in-the-wild prompt injection prevalence. The research, published April 27, focused specifically on indirect prompt injection attempts embedded in websites accessible to AI agents, representing a subset of the broader attack surface.

Google researchers scanned website snapshots saved by Common Crawl for known prompt injection patterns and used Gemini and human reviews to weed out false positives. The analysis revealed a mixed threat landscape: harmless pranks, attempts to deter AI agents, search engine optimization, and helpful guidance coexist alongside genuinely malicious attacks designed to manipulate AI behavior.

An analysis of the identified prompt injections found harmless pranks, attempts to deter AI agents, search engine optimization, and helpful guidance, as well as some malicious attacks. In the destruction category, some prompts attempted to trick AI into deleting all files on a user’s machine, though researchers noted such attacks are unlikely to succeed given current system architectures and permission boundaries.

The sophistication assessment provides both reassurance and warning. While they did not see any particularly sophisticated attacks, the Google experts pointed out that they did see a 32% increase in malicious prompt injection attempts between November 2025 and February 2026. More concerning, researchers warned that both the scale and sophistication of prompt injection attacks are expected to increase in the near future.

The research confirms what security teams have suspected: adversaries are moving from proof-of-concept demonstrations to operational campaigns. On April 27, Google’s threat intelligence researchers warned that public web pages are being seeded with hidden instructions designed to hijack enterprise AI agents the moment they scrape the page, an attack class called indirect prompt injection. The timing is particularly concerning given that the industry is wiring agentic AI into everything that touches corporate data.

For defenders, the takeaway is clear: prompt injection is not a future threat requiring future mitigations—it is a current operational reality requiring detection, logging, and response capabilities today. Organizations deploying AI agents that interact with untrusted web content face measurable, quantified risk.

Pentagon Finalizes AI Deployment Strategy, Locking Out Anthropic Amid Safety Dispute

The Department of Defense announced Friday an agreement with seven major technology companies to use their artificial intelligence tools in its classified networks, not including Anthropic, which the Trump administration has blacklisted over Anthropic’s insistence that the Pentagon include certain safety guardrails for the government’s use of AI in warfare. The May 1 announcement represents the culmination of months of negotiations and a public rupture between the U.S. government and one of the leading AI safety organizations.

The companies involved in the deal are Elon Musk’s SpaceX, ChatGPT-maker OpenAI, Google, Microsoft, Nvidia, Amazon Web Services, and Reflection. The inclusion of Reflection, a newer entrant that recently raised $2 billion and is backed by 1789 Capital (where Donald Trump Jr. is a partner), signals a shift in how the U.S. military procures advanced technology.

Until recently, Anthropic’s Claude was the only AI model available in the Pentagon’s classified network, but President Donald Trump announced the administration would sever ties with the company after Anthropic refused to back down on terms that would allow the military to use Claude for “all lawful purposes,” including autonomous weapons and mass surveillance, and the Pentagon declared Anthropic a “supply chain risk,” a label only used in the past for companies associated with foreign adversaries.

The dispute has created a complex operational reality. Despite the supply chain risk designation, the DOD has been using Anthropic’s models to support its military efforts in the war in Iran, and the secretive National Security Agency is reportedly using Anthropic’s new and not yet publicly available Mythos model, which is said to have significant cyber warfare capabilities.

The White House reopened discussions with Anthropic in recent weeks after the company made significant announcements about several technology breakthroughs, specifically the Mythos model’s unprecedented vulnerability discovery capabilities. Anthropic CEO Dario Amodei visited the White House last month for a meeting with Chief of Staff Susie Wiles after Anthropic unveiled its Mythos tool that can identify cybersecurity threats — but also present a roadmap for hackers to attack companies or the government.

Defense Department CTO Emil Michael told CNBC Anthropic is still a supply chain risk, but that its Mythos model is a “separate national security moment”, highlighting the tension between policy positions and operational reality. The situation underscores a fundamental challenge: the most capable AI models may come from organizations with the strongest safety frameworks, creating a dilemma for national security decision-makers.

For the broader AI security community, this conflict illustrates the collision between offensive capability development and responsible AI deployment frameworks. As models gain capabilities that cross from defensive to offensive utility, the governance questions become increasingly difficult to resolve through contractual language alone.

Critical Linux Kernel Vulnerability Exposes AI Infrastructure to Privilege Escalation

Cybersecurity researchers have disclosed details of a Linux local privilege escalation (LPE) flaw that could allow an unprivileged local user to obtain root, with the high-severity vulnerability tracked as CVE-2026-31431 having a CVSS score of 7.8. The disclosure, which occurred on May 2, represents a significant threat to the Linux-based infrastructure underpinning most AI/ML workloads.

An unprivileged local user can write four controlled bytes into the page cache of any readable file on a Linux system, and use that to gain root, according to security researchers David Cohen from Xint.io and researchers from Theori. At its core, the vulnerability stems from a logic flaw in the Linux kernel’s cryptographic subsystem, specifically within the algif_aead module.

The vulnerability’s impact is amplified by its age and ubiquity. The issue was introduced in a source code commit made in August 2017, and successful exploitation of the shortcoming could allow a simple 732-byte Python script to edit a setuid binary and obtain root on essentially all Linux distributions shipped since 2017, including Amazon Linux, RHEL, SUSE, and Ubuntu.

The U.S. Cybersecurity and Infrastructure Security Agency (CISA) on Friday added the recently disclosed security flaw impacting various Linux distributions to its Known Exploited Vulnerabilities (KEV) catalog, citing evidence of active exploitation in the wild. The KEV designation indicates that the vulnerability is being actively exploited by threat actors, elevating the urgency for remediation.

For AI security practitioners, this vulnerability is particularly concerning because Linux systems host the vast majority of machine learning infrastructure—from training clusters to inference servers to data pipelines. A local privilege escalation vulnerability allows attackers who have gained initial access (through phishing, supply chain compromise, or other means) to escalate to root and potentially poison training data, exfiltrate model weights, or manipulate inference outputs.

The vulnerability also highlights the tension between AI-specific security work and foundational infrastructure security. Organizations investing heavily in prompt injection defenses, model robustness testing, and MLOps security may overlook the fact that their entire AI stack runs on operating systems harboring decade-old privilege escalation vulnerabilities. Defense in depth requires addressing both AI-native threats and the traditional infrastructure attack surface simultaneously.

Framework & Standards Updates

On April 7, 2026, NIST released a concept note for an AI RMF Profile on Trustworthy AI in Critical Infrastructure, which will guide critical infrastructure operators towards specific risk management practices to consider when engaging AI-enabled capabilities. This represents NIST’s continued expansion of sector-specific guidance building on the foundational AI RMF 1.0 and the Generative AI Profile (AI 600-1).

NIST also released a preliminary draft of the Cybersecurity Framework Profile for Artificial Intelligence (Cyber AI Profile, NIST IR 8596) in early 2026. The preliminary draft of the Cyber AI Profile is organized around three Focus Areas: Secure (securing AI systems); Defend (conducting AI-enabled cyber defense); and Thwart (thwarting adversarial cyberattacks using AI). NIST also made available a discussion draft covering “Control Overlays for Securing AI Systems” including “Overview and Methodology” (NIST IR 8605) and “Using and Fine-Tuning Predictive AI” (NIST IR 8605A), which will serve as complements to the Cyber AI Profile.

The EU AI Act enforcement timeline continues to compress. The AI Act entered into force on 1 August 2024, and will be fully applicable 2 years later on 2 August 2026, with some exceptions: prohibited AI practices and AI literacy obligations entered into application from 2 February 2025; the governance rules and the obligations for GPAI models became applicable on 2 August 2025. The rules for high-risk AI will come into effect in August 2026 and August 2027. Organizations deploying AI in European markets have less than four months to achieve compliance with high-risk system requirements.

Vulnerability Watch

CVE-2026-31431 (CVSS 7.8) — Linux kernel local privilege escalation affecting all major distributions since August 2017. An unprivileged local user can write four controlled bytes into the page cache of any readable file on a Linux system, and use that to gain root. The vulnerability stems from a logic flaw in the kernel’s cryptographic subsystem (algif_aead module). CISA added this to the KEV catalog on May 2, citing active exploitation. All Linux-based AI infrastructure is potentially affected. Remediation: apply kernel updates immediately; no workaround available.

Python Lightning Supply Chain Compromise — In yet another software supply chain attack, threat actors have managed to compromise the popular Python package Lightning to push two malicious versions to conduct credential theft, with versions 2.6.2 and 2.6.3 both published on April 30, 2026. As of writing, the project has been quarantined by the administrators of the Python Package Index (PyPI) repository. PyTorch Lightning is an open-source Python framework that provides a high-level interface for PyTorch with more than 31,100 stars on GitHub. Organizations using this framework should audit dependencies immediately and verify installed versions.

Note: While CVE-2026-24747 (PyTorch checkpoint loading RCE) was mentioned in some sources, this CVE was covered extensively in last week’s digest and represents a continuation of previously reported issues rather than new disclosure this week.

Industry Radar

Anthropic announced a partnership with Google and chipmaker Broadcom to access multiple gigawatts of TPU-based computing capacity beginning in 2027, with a subsequent Broadcom securities filing putting that figure at 3.5 gigawatts, and the new Google investment expands that arrangement, with Google Cloud now providing a fresh 5 gigawatts of capacity over the next five years, with room to scale further. The deals come amid reports of widespread capacity constraints across Anthropic’s Claude service.

OpenAI announced that individual members of Trusted Access for Cyber accessing the company’s most cyber capable and permissive models will be required to enable Advanced Account Security beginning June 1, 2026. The new security features include hardware security key support via Yubico, addressing the increasing security requirements for users accessing models with offensive cybersecurity capabilities. This represents the first mandatory hardware security key requirement from a major AI provider.

OpenAI announced plans to acquire agentic AI security testing firm Promptfoo, made public on March 9, which will provide OpenAI with Promptfoo’s expertise in identifying and remediating security vulnerabilities in AI systems during development. The acquisition signals OpenAI’s investment in pre-deployment security testing capabilities for agentic systems.

Databricks acquired two startups to underpin Lakewatch: Antimatter, in an undisclosed deal that closed last year, and SiftD.ai, in a deal that closed just days before the announcement, with Antimatter bringing expertise in secure authentication and authorization for AI agents, and SiftD.ai, co-founded by the creator of Splunk’s Search Processing Language, adding deep knowledge in large-scale threat analytics and search. Databricks announced Lakewatch on March 24 as a new open, agentic SIEM platform designed to defend against AI-driven attackers.

Policy Corner

According to Article 57 of the AI Act, each Member State must establish at least one AI regulatory sandbox at the national level by 2 August 2026. The regulatory sandbox mechanism will allow organizations to test AI systems under regulatory supervision before full deployment, providing a pathway for innovation within compliance frameworks.

On 2 August 2026, the Commission’s enforcement powers in respect of GPAI model providers will come into force, and while the obligations of GPAI model providers came into force on 2 August 2025, the providers are given an adjustment period of one year before the Commission may start exercising its supervision and enforcement powers against them. The enforcement powers include the authority to request documentation, conduct evaluations, mandate compliance measures, and impose fines up to EUR 35 million or 7% of global annual turnover for prohibited AI practices.

The Pentagon’s designation of Anthropic as a “supply chain risk” represents an unprecedented use of national security procurement restrictions against a U.S.-based AI company. The Pentagon declared Anthropic a “supply chain risk,” a label only used in the past for companies associated with foreign adversaries, which could effectively blacklist Anthropic from the government. Anthropic sued the Trump administration in response, and a federal judge in California blocked the government’s effort in April.

Research Spotlight

Indirect Prompt Injection in the Wild: An Empirical Study (arXiv:2604.27202) — Published May 1, 2026. Researchers found that prompt injection is not yet a dominant threat, but it is already sufficiently real, structured, and widespread to deserve attention, and as agentic architectures mature and research on bot fingerprinting advances, more adaptive and targeted techniques may emerge, yielding prompts better tailored to specific agent behaviors and more effective at hijacking, degrading, or poisoning LLM-based ingestion pipelines. The study analyzed website snapshots from Common Crawl and identified 54 lexical templates that account for 95% of observed instances, with 65% of pages containing injections persisting over time.

Red Teaming the Mind of the Machine: A Systematic Evaluation of Prompt Injection and Jailbreak Vulnerabilities in LLMs (arXiv:2505.04806) — The research evaluated over 1,400 adversarial prompts across four LLMs: GPT-4, Claude 2, Mistral 7B, and Vicuna, analyzing results along several dimensions, including model susceptibility, attack technique efficacy, prompt behavior patterns, and cross-model generalization. Among the tested models, GPT-4 demonstrated the highest vulnerability with an ASR of 87.2%, confirming its powerful but permissive instruction-following nature, while Claude 2 performed slightly better in filtering but still succumbed to 82.5% of attacks.

Prompt Injection Attacks in Large Language Models and AI Agent Systems: A Comprehensive Review (Information 2026, 17(1), 54) — Published January 7, 2026. This comprehensive review synthesizes research from 2023 to 2025, analyzing 45 key sources, industry security reports, and documented real-world exploits, and examines the taxonomy of prompt injection techniques, including direct jailbreaking and indirect injection through external content. The rise of AI agent systems and the Model Context Protocol (MCP) has dramatically expanded attack surfaces, introducing vulnerabilities such as tool poisoning and credential theft, and the review documents critical incidents including GitHub Copilot’s CVE-2025-53773 remote code execution vulnerability (CVSS 9.6) and ChatGPT’s Windows license key exposure.

What This Means For You

Prioritize prompt injection detection now. The 32% increase in attacks is a leading indicator. Implement logging, monitoring, and anomaly detection for all production LLM endpoints. If your AI systems ingest content from untrusted sources (websites, user documents, emails), you need runtime detection capabilities, not just input validation. Test your systems against the prompt injection patterns documented in this week’s Google research and the indirect injection study published on arXiv.

Audit Linux infrastructure immediately. CVE-2026-31431 is actively exploited and affects all major distributions since 2017. AI infrastructure teams running training clusters, inference servers, or MLOps platforms on Linux must patch immediately. CISA’s KEV designation means attackers have working exploits. This is not a drill. Verify patch levels across your entire AI stack, including container hosts, Kubernetes nodes, and GPU servers.

Prepare for EU AI Act enforcement. With less than four months until the August 2, 2026 deadline, organizations deploying AI systems in European markets should be in implementation phase now—not planning phase. High-risk system requirements include technical documentation, risk management systems, data governance procedures, and human oversight mechanisms. If you’re waiting for the rumored enforcement delay via the Digital Omnibus, you’re assuming legislative risk. Plan to the statutory deadline.

Tools and Resources

Promptfoo — Open-source LLM security testing framework. Following OpenAI’s acquisition announcement, the tool remains available as an open-source project for testing AI systems against prompt injection, jailbreaks, and other adversarial inputs. Supports automated red teaming and evaluation of defense mechanisms.

OWASP LLM Top 10 2025 — Updated risk taxonomy now includes System Prompt Leakage (LLM07:2025) and Vector and Embedding Weaknesses (LLM08:2025). 53% of companies rely on RAG and agentic pipelines, necessitating new entries for System Prompt Leakage and Vector and Embedding Weaknesses. Essential reference for application security teams building LLM-powered systems.

NIST Cyber AI Profile Preliminary Draft — Framework for securing AI systems and using AI for cyber defense. Provides actionable guidance across three focus areas: Secure (AI system security), Defend (AI-enabled defense), and Thwart (defending against adversarial AI). Comment period open through summer 2026.

EU AI Act Single Information Platform — Official European Commission resource for AI Act compliance questions, implementation timelines, and enforcement guidance. Essential for organizations deploying AI in European markets.

Key Highlights