Last Week in AI Security — Week of May 4, 2026

Executive Summary

The week of May 4, 2026, demonstrated how AI infrastructure vulnerabilities and social engineering attacks are converging to create novel threat vectors. Two major developments dominate the landscape: a critical unauthenticated memory disclosure flaw in Ollama affecting an estimated 300,000 internet-exposed servers, and a sophisticated malvertising campaign that weaponizes legitimate user-generated content platforms—specifically Claude.ai shared chats and Evernote—to distribute malware via Google Ads.

CVE-2026-7482, dubbed “Bleeding Llama,” allows remote attackers to exploit a heap out-of-bounds read in Ollama’s GGUF model loader to exfiltrate environment variables, API keys, system prompts, and concurrent user conversation data without authentication. The vulnerability’s CVSS score of 9.1 reflects its severity and exploitability. Default configurations bind to localhost, but the documented OLLAMA_HOST=0.0.0.0 setting is widely deployed, leaving approximately 300,000 servers publicly accessible. The disclosure process itself became a case study in CVE ecosystem dysfunction: after traditional channels stalled, Echo AI acted as an alternative CNA to assign the identifier and publish detailed technical context.

Simultaneously, attackers demonstrated a sophisticated understanding of trust boundaries by abusing Google’s advertising platform to deliver malicious instructions through genuine claude.ai shared chat URLs. Users searching for “Claude mac download” encounter sponsored results pointing to authentic claude.ai pages containing terminal commands that silently install malware. The campaign exploits a systemic failure across three layers: insufficient ad moderation at Google, delayed content review at Anthropic, and poor UX signaling that user-generated content hosted on claude.ai inherits brand trust without clear visual distinction.

On the policy front, EU legislators finalized the AI Omnibus package, reaching agreement just hours before the original August 2026 compliance deadline would have triggered high-risk AI system requirements across member states.

Top Stories

Critical Ollama Vulnerability Exposes 300,000 Servers to Memory Exfiltration

Cyera researchers disclosed CVE-2026-7482, a critical vulnerability (CVSS 9.1) in Ollama that enables unauthenticated attackers to leak the entire Ollama process memory, potentially impacting 300,000 servers globally. The flaw, codenamed Bleeding Llama, resides in how Ollama processes GGUF (GPT-Generated Unified Format) model files during quantization.

Ollama before version 0.17.1 contains a heap out-of-bounds read vulnerability where the /api/create endpoint accepts attacker-supplied GGUF files with declared tensor offsets and sizes exceeding actual file length, triggering reads past allocated heap buffers during quantization in fs/ggml/gguf.go and server/quantization.go. The leaked memory contents—including environment variables, API keys, system prompts, and concurrent user conversations—can be exfiltrated by uploading the resulting model artifact through the /api/push endpoint to attacker-controlled registries.

The vulnerability requires no authentication because the /api/create and /api/push endpoints lack authentication in upstream distributions, and while default deployments bind to 127.0.0.1, the documented OLLAMA_HOST=0.0.0.0 configuration sees wide adoption in production environments. This configuration choice transforms what should be a localhost-only service into a remotely exploitable attack surface.

The disclosure timeline reveals significant friction in the CVE assignment process. After submitting a CVE request through MITRE on March 2, 2026, and receiving no response despite follow-ups through April 26, researchers approached Echo, a third-party CNA, which assigned CVE-2026-7482 on April 28. This parallel track illustrates how traditional CVE assignment bottlenecks create visibility gaps that leave users unaware of critical patches—Ollama version 0.17.1 silently fixed the issue without flagging it as a security update.

Organizations running Ollama must upgrade to version 0.17.1 or later immediately. Deployments should implement authentication proxies or API gateways in front of all instances, never expose them to the internet without IP access filters, and rotate API keys and credentials if any server was previously internet-accessible. The incident underscores a broader pattern: AI inference frameworks designed for local use are being deployed in network-facing configurations without commensurate authentication controls.

Malvertising Campaign Weaponizes Claude.ai Shared Chats to Distribute Malware

A novel malvertising campaign discovered this week exploits an unexpected trust boundary: legitimate user-generated content on claude.ai and other platforms combined with Google’s sponsored search results. Attackers purchase Google Ads targeting queries like “Claude mac download,” which display sponsored results pointing to genuine claude.ai shared chat URLs containing malicious terminal commands that install malware.

Security engineer Berk Albayrak identified a Claude.ai shared chat presenting itself as “Claude Code on Mac” installation guide attributed to “Apple Support,” walking users through Terminal commands that silently download and execute malware. The attack chain relies on users trusting content hosted on claude.ai’s primary domain, where minimal visual distinction between official documentation and user-generated chats creates exploitable ambiguity.

The campaign demonstrates cross-platform expansion: attackers replicated the technique using share.evernote.com for user-generated content, proving the approach extends beyond any single service to any platform hosting UGC on trusted domains. Both Windows and macOS users are targeted with platform-specific payloads—Windows variants deploy InfoStealer malware, while macOS receives Mach-O backdoors via obfuscated shell commands.

The attack succeeds through layered trust exploitation. Hosting user-generated content on the main claude.ai domain may serve business or SEO objectives, but creates unjustifiably high trust levels for users unable to distinguish official pages from UGC; disclaimers are small, barely visible, and invisible on mobile. Google’s ad moderation failures compound the issue—sponsored results lend legitimacy that bypasses user skepticism.

Users should navigate directly to claude.ai rather than clicking sponsored results, verify that legitimate Claude Code CLI comes from official Anthropic documentation, and treat any terminal command paste instructions with extreme caution regardless of apparent source. The incident highlights a systemic failure across advertising platforms and UGC hosting practices that AI security teams cannot ignore.

EU Finalizes AI Act Simplification Hours Before Compliance Deadline

At 4:30 a.m. on May 7, EU legislators reached agreement on the AI Omnibus regulation amending the EU AI Act, concluding six months of negotiations conducted under exceptionally tight schedule to finalize procedures before the original August 2, 2026 deadline. The agreement represents a qualified success: negotiators postponed high-risk AI system requirements while introducing targeted simplifications and new prohibitions.

The Commission proposed adjusting timelines for high-risk AI system rules by up to 16 months so requirements apply only after confirming needed standards and tools are available; co-legislators treated the proposal with utmost priority given provisions were due to enter force August 2, 2026. The outcome provides breathing room for organizations still building compliance infrastructure while maintaining the Act’s core framework.

Key changes include: postponing AI regulatory sandbox establishment deadlines to August 2, 2027, reducing grace periods for transparency solutions from 6 to 3 months with new deadline of December 2, 2026, and opening consultations on draft AI transparency obligation guidelines while banning “nudification” apps to protect citizens.

The industrial AI treatment proved most contentious. Parliament sought to exempt regulated products—machinery, toys, medical devices incorporating AI as safety components—arguing existing sectoral regulation makes additional AI Act oversight burdensome duplication; the compromise establishes mechanisms to resolve conflicts between AI Act and sectoral legislation.

For organizations, until August 2, 2026, classify all AI systems and assess high-risk or prohibited categories while implementing risk management, human oversight, data governance, and transparency measures; by August 2 complete conformity assessments, technical documentation, CE marking, and EU database registration; after August continuously monitor updates and report incidents. The August 2026 deadline remains the critical compliance milestone despite simplifications.

Framework & Standards Updates

NIST Cyber AI Profile Released in Preliminary Draft

NIST released the preliminary draft of the Cybersecurity Framework Profile for Artificial Intelligence (NIST IR 8596), designed as a voluntary framework extending CSF 2.0 to cybersecurity risks and opportunities introduced by AI while complementing the AI RMF. The preliminary draft organizes around three focus areas: Secure (securing AI systems), Defend (conducting AI-enabled cyber defense), and Thwart (thwarting adversarial cyberattacks using AI). NIST also released companion discussion drafts on “Control Overlays for Securing AI Systems” (NIST IR 8605) and “Using and Fine-Tuning Predictive AI” (NIST IR 8605A). Comment period remains open.

MITRE ATLAS Expands Coverage for Agentic AI

MITRE’s May 2026 Secure AI update expanded ATLAS with new techniques, mitigations, case studies, a Technique Maturity filter, and rapid-response capabilities; the team transitioned ATLAS to monthly release cadence and expanded coverage for Agentic AI and LLM threats as agentic systems can independently make decisions, take actions, and interact across environments with reduced human oversight. As of version 5.1.0 (November 2025), ATLAS contains 16 tactics, 84 techniques, 56 sub-techniques, 32 mitigations, and 42 case studies; the February 2026 v5.4.0 update added agent-focused techniques reflecting accelerating AI threat evolution.

ISO/IEC 42001 Adoption Accelerates

No major ISO/IEC 42001 framework updates this week, but Microsoft’s progress toward ISO 42001 certification represents a pivotal achievement in responsible AI leadership, with independent validation providing customers assurance over application of the Responsible AI Standard for AI risk management throughout the AI lifecycle. Enterprises increasingly treat ISO 42001 as operational layer beneath regulatory compliance.

Vulnerability Watch

CVE-2026-7482: Ollama Heap Out-of-Bounds Read (CVSS 9.1)

Ollama before 0.17.1 contains a heap out-of-bounds read where /api/create accepts attacker-supplied GGUF files with tensor offsets/sizes exceeding file length, triggering reads past allocated buffers during quantization, leaking environment variables, API keys, system prompts, and user conversation data. Fix: Update to Ollama 0.17.1 or later. Restrict /api/create and /api/push access. Avoid 0.0.0.0 binding. Rotate credentials if previously exposed.

CVE-2026-44484: PyTorch Lightning PyPI Compromise

Lightning AI identified compromise of PyTorch Lightning PyPI package versions; security advisory published April 30, 2026, recommends upgrading to version 2.6.1 and revoking/rotating all internal credentials associated with release process. Investigation ongoing. Organizations using PyTorch Lightning should verify package integrity and update immediately.

CVE-2026-42248 & CVE-2026-42249: Ollama Windows Update Mechanism Flaws (CVSS 7.7 each)

Striga disclosed two unpatched vulnerabilities in Ollama’s Windows update mechanism following 90-day disclosure period since January 27, 2026; path traversal and missing signature check can be chained into persistent code execution when combined with on-login routine. Mitigation: Monitor OLLAMA_UPDATE_URL overrides and disable AutoUpdateEnabled if environment allows update response manipulation.

Additional PyTorch Vulnerabilities

CVE-2026-24747 identifies critical remote code execution in PyTorch’s weights_only unpickler allowing attackers to craft malicious checkpoint files (.pth) capable of corrupting memory and achieving arbitrary code execution when victims load files using torch.load(…, weights_only=True). PyTorch version 2.10.0 addresses this vulnerability; organizations should upgrade immediately, especially in environments processing model checkpoints from external sources.

Industry Radar

MITRE Secure AI Program Growth: The MITRE Secure AI Program, supported by 16 member organizations including Microsoft and JPMorgan Chase, focuses on expanding ATLAS with real-world observations and expediting AI incident sharing.
NIST CAISI Under-Resourcing Concerns: The Trump-aligned America First Policy Institute called CAISI “chronically underfunded” with approximately 30 total staff and $30 million total funding since 2024—less than similar centers in Canada and Singapore. An EO requiring pre-deployment review of frontier AI models would likely increase CAISI workload.
AI Investment Trends: Sonatype data shows malicious packages in public repositories grew from 55,000 in 2022 to 454,600 in 2025, with notable leaps in 2023 (GPT-4 release) and 2025 (agentic coding adoption year). Time-to-exploit decreased from over 700 days in 2020 to 44 days in 2025; Mandiant’s M-Trends 2026 found 28.3% of CVEs exploited within 24 hours of disclosure.
Microsoft, Google, xAI Grant Federal AI Model Access: The Center for AI Standards and Innovation at the Department of Commerce announced agreements with Microsoft, Google, and xAI for federal access to new AI models for national security testing, amid concerns over Anthropic’s Mythos capabilities.

Policy Corner

EU AI Act Enters Final Compliance Phase

The AI Act entered into force August 1, 2024, and will be fully applicable August 2, 2026; prohibited AI practices and AI literacy obligations entered application February 2, 2025; governance rules and GPAI model obligations became applicable August 2, 2025; high-risk AI systems embedded in regulated products have extended transition until August 2, 2027. The May 7 agreement postpones AI regulatory sandbox establishment to August 2, 2027, and reduces transparency solution grace periods from 6 to 3 months with December 2, 2026 deadline.

Trump Administration Weighs AI Safety Executive Order

Kevin Hassett, director of the National Economic Council, said the administration is studying an executive order to ensure new AI models are secure before public release, comparing the approach to FDA drug evaluation; the EO would provide a clear roadmap for future AI releases after proving safety. The White House Office of the National Cyber Director hosted two meetings last week with tech/cyber companies and trade groups to discuss security concerns raised by advanced AI models including Anthropic’s Mythos; the office is discussing an AI security framework requiring Pentagon-led safety testing for AI deployments across federal, state, and local government.

Colorado AI Act Safe Harbor

The Colorado AI Act explicitly cites NIST AI RMF compliance as grounds for an affirmative defense; organizations demonstrating compliance may qualify for safe harbor protections against enforcement actions, while penalties can reach $20,000 per violation enforced by the Colorado Attorney General.

Research Spotlight

No new peer-reviewed academic papers from the May 4-10 window were identified through search. Several framework updates and industry research reports were covered in other sections.

What This Means For You

Immediate Actions for Ollama Deployments

If your organization runs Ollama, treat this as a drop-everything patching event. Upgrade to version 0.17.1 immediately. Audit every deployment for internet exposure—do not trust firewall rules alone; actively scan for 0.0.0.0 bindings and accessible /api/create endpoints. Any instance that was ever reachable from the internet should be considered compromised from a secrets perspective. Rotate API keys, environment variables, database credentials, and any secrets that may have been resident in process memory. Implement authentication proxies in front of all Ollama instances going forward, and segment AI inference infrastructure to separate security zones.

Rethink Search Behavior and Developer Onboarding

The Claude.ai malvertising campaign demonstrates that sponsored search results are no longer a trustworthy shortcut to official software downloads. Update developer onboarding materials and security awareness training to emphasize direct navigation to vendor sites via bookmarks or manually typed URLs, never via search ads. For AI tools specifically, maintain an internal registry of official download locations and package manager commands. Consider browser extensions or DNS-level blocking for known malicious infrastructure, and monitor for unusual Terminal or PowerShell command execution patterns across developer workstations.

August 2026 EU AI Act Compliance Is Non-Negotiable

The AI Omnibus simplifications do not eliminate the August 2, 2026 compliance deadline for high-risk AI systems—they postpone certain obligations while tightening others. If you deploy AI systems to EU markets, complete your risk classification now. Map each system to the Act’s four tiers (unacceptable, high, limited, minimal). For high-risk systems, establish conformity assessment processes, draft technical documentation, implement human oversight mechanisms, and prepare CE marking workflows. The transparency requirement deadline moved up to December 2, 2026, giving less time to implement content provenance and synthetic media labeling. Treat these as engineering deadlines, not legal abstractions.

Tools and Resources

NIST Cyber AI Profile (NIST IR 8596) — Preliminary draft extending CSF 2.0 to AI-specific cybersecurity risks; public comment period open. Access at NIST
MITRE ATLAS v5.4.0 — Expanded framework with 16 tactics, 84 techniques covering agentic AI attack paths; monthly release cadence. Access at atlas.mitre.org
runZero Ollama Detection Query — vendor:=Ollama AND product:=Ollama AND source:runzero for identifying vulnerable instances. Blog post with details
Echo AI CVE Assignment — Alternative CVE Numbering Authority that processed CVE-2026-7482 after traditional channel delays; provides detailed technical records. CVE-2026-7482 page
EU AI Act Compliance Checker — Simplified tool for SMEs/startups to understand AI Act obligations. Access at artificialintelligenceact.eu

Key Highlights