December 10th 2025, by François Proulx, VP of Security Research @ BoostSecurity.io
TL;DR: 2025 didn't give us a new, magical Supply Chain vuln class; **instead it gave us attackers who finally started reading our manuals**.
From Ultralytics' pull_request_target 0-day (where a BreachForums post indicates they used our own poutine scanner to find it) through Kong, tj-actions, GhostAction, Nx, GlassWorm and both Shai-Hulud waves, the common pattern wasn't typosquats but Pipeline Parasitism: living off CI/CD, maintainer accounts and developer endpoints using the same tools and patterns we published to defend them.
The vuln mechanics stayed boring: shell injections and over-privileged tokens. But they were operationalized with worms, invisible Unicode payloads, blockchain C2, and even wiper failsafes.
Thankfully, platforms are finally improving, yet "pwn request" is here to stay; the only sustainable answer is to treat pipelines as production systems and publish future research assuming adversaries are our most diligent readers!
Table of Contents
Introduction: The Uncomfortable Baseline
Chapter 1: The turning point: Ultralytics & BreachForums
Chapter 2: Pipeline Parasitism goes mainstream
Chapter 3: Invisible Enemies: Unicode & GlassWorm
Chapter 4: Ecosystem Scale: The Shai-Hulud Worms
Chapter 5: Analysis: Research as a Requirements Document
Conclusion: Defensible by Design
Introduction: The Uncomfortable Baseline
In 2025, the unsettling part wasn't that Supply Chain attacks suddenly became possible: it was that they started to look uncomfortably familiar. Campaign after campaign read like someone had taken years of CI/CD vulnerability research, SLSA Threat Model diagrams, and conference exploitation demos, and simply run them in production.
This article is NOT an end-of-year "Top 10 Most Bad Ass Breaches" recap. It's the story of a pivot point: the year where pipeline hardening guides, LOTP talks, and Open Source scanners stopped being just defensive artifacts and became part of the offensive toolkit. If you spent 2023-2024 mapping how PRs, bots, and build runners could go wild, 2025 felt like watching those PTSD-inducing Tabletop whiteboard scenarios replayed at ecosystem scale.
To understand why, we have to start with the uncomfortable baseline: we already knew the foundations were fragile. I hate to say it, but I saw that movie playing in my head several years ago as I exploited my first "pwn request" and had my own "Oh Shit!" moment.
Fragile foundations, well-documented
Filippo Valsorda captured the mood of 2025 in his essay "State of Supply Chain Compromise". The thesis was simple and uncomfortable: our ecosystem is structurally fragile, compromise is inevitable, and we should stop pretending otherwise.
By the time his essay circulated through Blue Team Slack channels and newsletters, a loose coalition of researchers and builders had already spent the better part of the past couple of years stress-testing CI/CD systems:
- Exposing how GitHub Actions, GitLab CI, Tekton, CircleCI and friends routinely run untrusted Pull Request code from forks with overly privileged secrets.
- Demonstrating Poisoned Pipeline Execution (PPE) and "pwn request" patterns where a single Pull Request (or GitHub Issue / Comment) can often trivially allow an attacker to exfiltrate secrets (signing keys, tokens, etc.) and pivot to compromise artifact registries and release pipelines.
- Publishing tooling and methodologies to make this repeatable at scale.
A large portion of the work we did was intentionally public. We wanted maintainers and platform teams to see how bad things were. We responsibly disclosed hundreds of those vulnerabilities, so much so that we had to build an agentic pipeline to triage, automate validation and generate draft reports.
We Open Sourced scanners like poutine to statically catch unsafe workflow patterns. We wrote articles like "Weaponizing Dependabot" to explain how seemingly benign automation can be chained into high-impact attacks, and "Split-Second Side Doors" to show how Bot-Delegated TOCTOU in CI can break your core Threat Model assumptions even when humans think they are "approving" safe changes.
We talked at several conferences, we were invited on podcasts, we created a whole CTF training. Many of us were early contributors and cheerleaders for efforts like SLSA's Source and Build track, trying to drag pipelines out of the "2005 PHP Web App" era of secure coding and into something more principled.
We weren't the only ones listening.
Chapter 1: The turning point: Ultralytics & BreachForums
Figure 1: The smoking gun. Threat actors explicitly citing our defensive tools.
Every field has a moment where a vague concern crystallizes into hard evidence. For CI/CD Supply Chain security, for me, that moment came in December 2024 with the Ultralytics incident.
A few months earlier, during routine research, we had flagged a painfully obvious injection bug in ultralytics/actions, a GitHub Action used in the YOLOv5 ecosystem. The pattern was depressingly familiar:
- A pull_request_target workflow.
- An attacker-controlled head branch name interpolated straight into a Bash step, resulting in a shell injection.
- A GitHub Personal Access Token loaded in memory during the build.
A textbook "pwn request" PPE scenario. We made a note to come back to it. Other fires were burning.
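To make the anti-pattern concrete, here is a minimal, hedged sketch of the kind of static check that catches this class of bug. It is deliberately much cruder than poutine or Zizmor and assumes PyYAML is available; it simply looks for the pull_request_target trigger combined with attacker-controlled context expressions interpolated straight into run steps.

```python
# Hedged sketch, not poutine's actual implementation: flag workflows that mix
# the pull_request_target trigger with attacker-controlled expressions
# interpolated directly into `run:` shell steps.
import re
import sys

import yaml  # pip install pyyaml

# Context expressions an attacker opening a PR from a fork can influence (non-exhaustive).
UNTRUSTED = re.compile(
    r"\$\{\{\s*github\.(head_ref|event\.pull_request\.(title|body|head\.(ref|label)))"
)

def looks_like_pwn_request(path: str) -> bool:
    with open(path, encoding="utf-8") as f:
        doc = yaml.safe_load(f) or {}
    # PyYAML parses the bare workflow key `on` as the boolean True.
    triggers = doc.get("on", doc.get(True, {}))
    if "pull_request_target" not in str(triggers):
        return False
    for job in (doc.get("jobs") or {}).values():
        for step in (job.get("steps") or []):
            if UNTRUSTED.search(step.get("run") or ""):
                return True
    return False

if __name__ == "__main__":
    for workflow in sys.argv[1:]:
        if looks_like_pwn_request(workflow):
            print(f"[!] possible pwn-request pattern in {workflow}")
```

Pointing something like this at a repo's .github/workflows directory reproduces the class of finding; real scanners obviously model far more triggers, contexts, and sinks.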
In early December, news broke: Ultralytics' PyPI package had been trojanized. A compromised GitHub Actions workflow had exfiltrated publishing credentials and shipped cryptomining payloads to unsuspecting ML users.
That was bad. The next discovery was worse.
Looking at Dark Web chatter using Flare, we found a freshly created BreachForums account. The actor's posting history looked… concise:
- A first post: OpSec 101 🤦🏻‍♂️.
- A second post, 12 hours before the Ultralytics compromise, dropping the exact 0-day chain against ultralytics/actions, complete with references to our own LOTP and poutine explaining how the vulnerability had been found.
- A third post, soon after the incident, bragging that someone had used "PRs to leak secrets from build pipelines," dropped a Monero miner, caused chaos, but made little money.
- The account never logged in again.
This wasn't vague speculation about "attackers probably read our blogs." This was a threat actor explicitly crediting our research tools and methodology, weaponizing them almost verbatim, and then disappearing.
It was the smoking gun.
From hackathon frustration to pipeline telemetry
We were frustrated. We had seen the bug months before. Our intuition was that it could have been caught even earlier, by looking at the ecosystem from the outside, correlating suspicious workflow changes, package publishes, and maintainer activity.
So we did what engineers do when they're annoyed: we built more plumbing.
Over a hackathon, we stitched together what became our Package Threat Hunter pipeline, ingesting the GitHub public events firehose, layering on some secret sauce, and trying to catch Build Pipeline exploits "in the act" instead of reverse-engineering them weeks later.
Ten days later, Kong's Kubernetes Ingress Controller incident happened: an unauthorized 3.4.0 release pushed through their CI, using legitimate build scripts and signing paths, but shipping a cryptominer.
Package Threat Hunter had captured the whole thing, minute by minute.
That experience locked in a mental model that 2025 would keep reinforcing: the interesting compromises were no longer simple typosquats. They were pipeline-centric operations, where attackers:
- Find or create a CI/CD foothold.
- Reuse as much of the legitimate release machinery as possible.
- Blend their payloads into official-looking artifacts and ecosystem-trusted channels.
And increasingly, they were doing it using our own research as a field guide.
Chapter 2: Pipeline Parasitism goes mainstream
The first quarter of 2025 made it clear that Ultralytics and Kong weren't anomalies.
**Actions on Actions:** tj-actions/changed-files
In March, we saw what happens when an attacker decides to treat GitHub Actions themselves as a transitive Supply Chain.
By compromising the maintainer account for tj-actions/changed-files, a hugely popular Action used in tens of thousands of workflows, an adversary was able to backdoor all existing versions, injecting a one-liner shell payload that ran in the context of each consumer's workflow.
This wasn't subtle. It was "GitHub-Actions-on-GitHub-Actions", a meta Supply Chain attack squarely aligned with the PPE patterns offensive researchers had been demonstrating for years. A popular vulnerable GitHub Action is the holy grail; the blast radius can be gigantic. It's a side-door 0-day to any workflow having it as a dependency.
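Because existing versions could be rewritten in place, mutable version tags were the consumer-side exposure. As a hedged, minimal sketch of the corresponding check (again far cruder than poutine or Zizmor), you can flag uses: references in a workflow that are not pinned to a full commit SHA:

```python
# Hedged sketch: list `uses:` references that point at mutable tags or branches
# instead of a full 40-character commit SHA, the property that let every
# existing tj-actions/changed-files version be backdoored in place.
import re
import sys

USES = re.compile(r"^\s*-?\s*uses:\s*([^\s#]+)", re.MULTILINE)
FULL_SHA = re.compile(r"@[0-9a-f]{40}$")

def unpinned_actions(path: str) -> list[str]:
    text = open(path, encoding="utf-8").read()
    refs = [m.group(1) for m in USES.finditer(text)]
    # Local (./...) and docker:// references are out of scope for this sketch.
    return [r for r in refs if "@" in r and not FULL_SHA.search(r)]

if __name__ == "__main__":
    for workflow in sys.argv[1:]:
        for ref in unpinned_actions(workflow):
            print(f"{workflow}: not pinned to a commit SHA -> {ref}")
```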

Figure 2: The meme from our Under The Radar talk slide deck (early 2024), now reality.
GitHub's response was equally telling: they took the unusual step of globally yanking tags, knowingly breaking builds to cut off impact. That's the kind of move platforms make when they recognize they're dealing with systemic, not local, risk.
GhostAction: workflows as malware
By September, GhostAction pushed the idea further. Here, the "malware" wasn't in a dependency: it was a GitHub Actions workflow.
Compromised accounts (ATO) received a malicious YAML file wired to run on every push and pull request. The logic was simple but devastating: on every execution, the workflow walked through all the secrets it could reach (GitHub, Docker Hub, NPM, PyPI, and cloud providers), then bundled them up and exfiltrated them to attacker-controlled infrastructure.
GitGuardian's telemetry later showed hundreds of accounts and thousands of secrets impacted. More importantly, GhostAction behaved like a crude GitHub-native worm. Once the malicious YAML landed in a few places, new tokens were exfiltrated and the cycle repeated.
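We are not quoting GhostAction's actual YAML here, but whether or not that campaign used it, serializing the whole secrets context is a classic exfiltration primitive and almost never legitimate, which makes it a cheap tripwire to grep for. A hedged sketch:

```python
# Hedged sketch: flag any workflow expression that dumps the entire `secrets`
# context in one go (e.g. `${{ toJSON(secrets) }}`), handing every available
# secret to a single step.
import re
import sys

DUMP_ALL_SECRETS = re.compile(r"toJSON\s*\(\s*secrets\s*\)", re.IGNORECASE)

if __name__ == "__main__":
    for workflow in sys.argv[1:]:
        text = open(workflow, encoding="utf-8").read()
        if DUMP_ALL_SECRETS.search(text):
            print(f"[!] {workflow} serializes the entire secrets context")
```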
Nx and the s1ngularity campaign
The Nx incident followed a similar script to Kong, with a twist. The root cause was depressingly familiar: a flawed GitHub Actions workflow wired to pull_request_target that shoved the unsanitized PR title and body straight into a shell, giving attackers high-privilege RCE in the repo. The gut punch came when Adnan Khan pointed out that this workflow hadn't been hand-written at all: it was generated and committed by Claude Code. In other words, the attacker walked through a CI backdoor that an AI had casually stamped into the codebase for them, before chaining other AI CLIs for recon in the malware itself. This is exactly why we've been wiring tools like poutine into MCP servers and code assistants: if you don't put a LOTP-aware linter in the loop, your "helper" can quietly gift attackers the perfect pull_request_target 0-day.
Grafana: canaries in the CI coal mine
If Nx was a reminder that AI can write you a perfect CI 0-day, Grafana was the reminder that good instrumentation can still save you.
In April 2025, Grafana Labs had their own pull_request_target-powered PPE: an insecure workflow let an attacker run a carefully crafted branch name through a shell, exfiltrating environment variables and a handful of credentials from a GitHub Actions job. On paper, that's the same story as Ultralytics or Nx, a classic pwn-request bug wired straight into CI.
The difference is how it ended. Grafana had seeded their environment with canary tokens, high-value-looking AWS keys and other decoys whose only purpose was to scream if anyone touched them. When the attacker validated one of those keys, Grafana's team got an immediate page, swarmed the incident, rotated everything, and confirmed there was no production or customer impact. They followed up by hardening their workflows, leaning on tools like Zizmor, a scanner in the same family as poutine that catches vulnerable Actions at scale, and later wrote publicly about how they design and place canaries.
Grafana's experience is worth calling out because it shows that Pipeline Parasitism isn't automatically a death sentence for a well-instrumented, well-staffed project. Most small, understaffed but wildly popular dependencies will never have a dedicated SRE on pager duty; expecting them to hand-craft canaries and incident playbooks is wishful thinking. The lesson here is less "everyone should do what Grafana did" and more "this kind of tripwire should exist as a platform-level feature". Registries, CI providers, or security tooling can make it cheap and mostly automatic for the long tail, not just a bespoke trick reserved for the Grafanas of the world.
Chapter 3: Invisible Enemies: Unicode & GlassWorm
As defenders got better at spotting crude pipeline abuse, attackers adapted in two directions: invisibility and persistence. Two campaigns in particular, the first Unicode-steganography NPM malware and GlassWorm, felt like they were written specifically to humiliate our assumptions about what "obvious" malware looks like.
The os-info-checker-es6 family looked like yet another forgettable system-info helper until fellow researchers pulled it apart and realized the malicious logic wasn't just obfuscated, it was missing from the tools we trusted to show us code. The payload was woven through Unicode Private Use Area (PUA) and zero-width characters in a way that made the bad branches completely invisible in VS Code, vim, Emacs, and GitHub's web diff. You could review the file line by line and never see the teeth. That shock is what led us to build an Open Source tool we called puant ("stinky" in French), whose only job is to sniff out suspicious PUA and zero-width usage so maintainers have at least a fighting chance of noticing when a diff is lying to them. Pair that with a C2 channel that hid commands and exfiltrated data inside Google Calendar events, and you get tradecraft that feels a lot closer to Lazarus Group / "Contagious Interview"-style operations than to a bored cryptominer.
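The check itself is almost embarrassingly small. This is a deliberately simplified sketch, not puant's actual implementation: walk each file and flag Private Use Area or zero-width code points that editors and diff viewers may render as nothing at all.

```python
# Simplified, hedged sketch of a puant-style check: surface invisible
# Private Use Area (Unicode category "Co") and zero-width code points in files.
import sys
import unicodedata

ZERO_WIDTH = {"\u200b", "\u200c", "\u200d", "\u2060", "\ufeff"}

def suspicious_codepoints(path: str):
    text = open(path, encoding="utf-8", errors="replace").read()
    for offset, ch in enumerate(text):
        if ch in ZERO_WIDTH or unicodedata.category(ch) == "Co":
            yield offset, f"U+{ord(ch):04X}"

if __name__ == "__main__":
    for path in sys.argv[1:]:
        for offset, cp in suspicious_codepoints(path):
            print(f"{path}: invisible/PUA character {cp} at offset {offset}")
```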
GlassWorm adapted this technique to target VS Code and Open VSX publishers directly. They used the invisible Unicode trick to hide activation logic within benign-looking extensions, but paired it with a sophisticated worming capability that used stolen tokens to infect the maintainers' other projects. Perhaps most alarmingly, they hardened their infrastructure by using the Solana blockchain for C2 communication, ensuring their command channel remained up even if their servers were nuked.
Taken together, the first Unicode-PUA NPM malware and GlassWorm were a visceral reminder that "just read the diff" or "just check the extension reviews" is no longer a sufficient review strategy, and that developer endpoints (IDEs and extension ecosystems) are now first-class targets in their own right, not collateral damage.
Chapter 4: Ecosystem Scale: The Shai-Hulud Worms
By late summer, attackers stopped sniping individual packages and went straight after the humans who maintain the building blocks of the JavaScript ecosystem. The Great NPM Heist against Qix-maintained, extremely popular and critical packages (chalk, debug, ansi-styles, and many more) was pure social engineering: a convincing phishing campaign spoofing NPM support and a fake TOTP management page were enough to turn a single maintainer account into a distribution hub for malicious versions of dozens of core libraries. The payload was a crypto stealer quietly swapping wallet addresses in any app that handled them. It never fully realized its potential, but it proved the point: compromise the right person, and you can poison billions of weekly installs.
The first Shai-Hulud wave took that insight and automated it. Instead of using a stolen NPM token to backdoor a single package, the worm's postinstall hook enumerated every package owned by the victim maintainer, injected a huge obfuscated payload, and republished new versions. Exfiltration flowed into GitHub repositories acting as dropboxes for stolen tokens and secrets. Within days, "a few compromised packages" had become hundreds of trojanized libraries and thousands of downstream repos touched: a maintainer-level worm, not a one-off incident.
Shai-Hulud 2.0 was the same concept turned up to eleven. Compromised packages dropped a preinstall script that installed or located the Bun runtime and spawned a heavily obfuscated bun_environment.js. On any machine with an NPM token, the worm walked the maintainer's entire portfolio, backdooring and republishing packages en masse and spreading across several hundred packages. The payload was tuned for developer endpoints and CI: it hunted GitHub PATs, NPM credentials, cloud keys, and SSH keys, then exfiltrated them via public GitHub repos literally branded "Sha1-Hulud: The Second Coming," often mixing data from multiple victims.
On machines with enough access, Shai-Hulud 2.0 went further and registered the host as a GitHub Actions self-hosted runner, wiring in attacker-controlled workflows so that revoking NPM tokens wasn't enough: you now had a durable, GitHub-mediated RCE path back into a victim's laptop. And when it couldn't steal or persist (no network, no usable tokens), it sometimes fell back to a blunt option: recursively deleting everything writable under the user's home directory. At that point you're not dealing with "just" a Supply Chain compromise, but with an ecosystem worm, a developer-endpoint infostealer, a GitHub-proxied backdoor, and a wiper rolled into one, all rooted in the same lesson as the Heist: maintainers are the real root of trust.
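Both waves rode the same mechanic: npm lifecycle install hooks running arbitrary code at install time. As a coarse, hedged triage sketch (it flags a capability, not malice; plenty of legitimate packages declare hooks too), you can simply enumerate which installed packages declare them:

```python
# Hedged triage sketch: list installed npm packages that declare lifecycle
# install hooks, the mechanism both Shai-Hulud waves used to run at install time.
import json
import sys
from pathlib import Path

HOOKS = ("preinstall", "install", "postinstall")

def packages_with_hooks(node_modules: str):
    for manifest in Path(node_modules).rglob("package.json"):
        try:
            pkg = json.loads(manifest.read_text(encoding="utf-8"))
        except (json.JSONDecodeError, UnicodeDecodeError, OSError):
            continue
        if not isinstance(pkg, dict):
            continue
        scripts = pkg.get("scripts") or {}
        declared = [h for h in HOOKS if h in scripts]
        if declared:
            yield pkg.get("name", str(manifest)), declared

if __name__ == "__main__":
    target = sys.argv[1] if len(sys.argv) > 1 else "node_modules"
    for name, hooks in packages_with_hooks(target):
        print(f"{name}: {', '.join(hooks)}")
```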
Chapter 5: Analysis: Research as a Requirements Document
Looking across Ultralytics, Kong, tj-actions, GhostAction, Nx, GlassWorm, the Great NPM Heist, and both ShaiâHulud waves, a pattern emerges on the pipeline and maintainer side of the house:
- The core vulnerability classes are not new. They are the same boring primitives we have been talking about for a decade: unsafe string interpolation into shells, over-privileged tokens and runners in CI, maintainer account takeover, and sketchy package install hooks combined with excessive post-publish trust.
- What's new is the level of operationalization. Attackers now use tooling to scan for the exact anti-patterns defenders look for, build worm-like propagation mechanisms across maintainer portfolios and ecosystems, and design obfuscation and C2 channels specifically to defeat human review and commodity detection.
- Our own research has become a roadmap for this specific class of attacks. Blog posts and talks about PPE, Dependabot abuse, and Bot-Delegated TOCTOU lay out exactly how a misconfigured pull_request_target or overeager bot can become an RCE and secrets factory. Open Source tools like poutine or Gato-X encode those patterns in code, making it trivial to scan large swaths of GitHub for promising CI/CD anti-patterns. Hardening guides and disclosure timelines document which combinations of triggers, permissions, and secrets lead to which impacts, effectively prioritizing targets for anyone willing to read.
Campaigns like rand-user-agent or os-info-checker-es6 live in a slightly different bucket: they were born malicious, crafted from day one as watering-hole artifacts (tiny helpers, convenience libraries, "just-trust-me" interview tools) rather than compromises of long-standing, trusted packages. They don't need our research to exist; registry incentives and the economics of "npm install whatever Stack Overflow says" are enough.
But even there, the ecosystem reaction is shaped by the same research. The reason we recognized the Unicode PUA trick, built tools like puant, and could talk about blockchain-backed C2 with any precision is because the community has spent years poking holes in how we review, ship, and consume code.
Attackers are doing what we would do in their position: reading everything, cloning everything, scripting everything. That doesn't mean we should stop publishing research. It does mean we need to internalize a hard truth: there is no longer a clean line between "defensive" and "offensive" Supply Chain research.
Every hardening guide is also a gap analysis for the next campaign. Every proof-of-concept is a template. Every detection rule is a hint about how to evade the next version.
Conclusion: Defensible by Design
The good news is that the ecosystem is responding. GitHub is on the verge (in early December 2025) of partially tightening the Threat Model around pull_request_target and branch-protected environments, chipping away at the "easy mode" PPE paths that powered Ultralytics and many copycats. NPM has changed token policies, including expiration for publish tokens, directly in response to the first Shai-Hulud wave, making stealing publish tokens slightly less attractive. Extension marketplaces are starting to treat publisher accounts and signing keys as critical infrastructure, not just UX plumbing.
These changes matter. Some of the most egregious low-hanging fruit is finally disappearing.
But as Clint Gibler put it in the TL;DRsec #307 newsletter referring to Shai-Hulud 2.0, there's a different kind of unease too: after each ecosystem-level incident, we spin up a flurry of >80% overlapping vendor posts that can feel, in his words, like "a lot of duplicate work," more marketing than new signal. The real opportunity cost isn't just readers' attention, it's all the expert time that could be going into longer-term, solve-this-class-of-problem work instead of re-describing the same campaign every few months.
Right now, most CI/CD platforms are effectively RCE-as-a-Service. We've handed every repo a Swiss-army knife of triggers, runners, and integrations, plus a Threat Model you only really understand if you treat the docs as light bedtime horror. There are too many choices, too many sharp edges, and too many caveats that only surface after someone chains them into a real attack.
That's barely acceptable for well-funded product teams; it's untenable for the tiny, under-resourced Open Source projects that end up underpinning half the internet. We can't keep playing the "SSO tax" game where enterprise-grade detection and response sits behind paywalls while the weekend pet project, the one that quietly becomes that XKCD comic everyone loves to share about a single unpaid maintainer holding up the world, gets the most dangerous defaults and the fewest guardrails, and we all keep shrugging and reposting the joke instead of fixing the incentives.
If 2025 was the year attackers industrialized Living Off The Pipeline, 2026 needs to be the year we start designing pipelines and ecosystems that are defensible by design.
That means more batteries-included, paved-road platforms and fewer sharp edges, especially for Open Source: safer defaults, opinionated workflow behaviors, and built-in tripwires that make the secure path the easy path instead of an expert-only side quest.
That points to a few concrete shifts:
- Treat pipelines as production systems, not glue code. Model GitHub Actions, GitLab CI, Tekton and friends as always-on RCE-as-a-Service platforms, not scripting conveniences; apply real Threat Modeling, enforce least privilege on tokens and runners, and separate untrusted PR evaluation from trusted release processes instead of letting everything run in one amorphous "build" stage (see the sketch after this list).
- Harden the human perimeter around maintainers and developers. Make hardware keys and phishing-resistant MFA table stakes for registry and marketplace accounts, give maintainers security awareness that matches their actual risk profile, and run endpoint protection tuned for developer workflows rather than generic office laptops.
- Invest in ecosystem-level telemetry and coordination. Share behavioral patterns for pipeline-centric attacks across vendors, build rapid cross-registry response muscle for maintainer compromises, and support community efforts like Open Source malware databases (OpenSSF / OSV / OSMDB), sandboxing infrastructure, and canary-style tripwires that can be baked into CI and registries themselves, the kind of instrumentation that helped Grafana turn a live pull_request_target exploit into a quickly contained non-event, but now made cheap and mostly automatic for the long tail of tiny, heavily-used packages.
- Publish research with an eye toward abuse. Focus proofs-of-concept on classes of issues rather than turnkey exploit chains, pair offensive insights with concrete, implementable mitigations, and assume by default that adversaries will be your most diligent readers.
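As one small, hedged illustration of the first bullet above (not any particular vendor's rule set), the least-privilege part can be checked mechanically: flag workflows that never declare a permissions block, or that ask for write-all, leaving the GITHUB_TOKEN broader than it needs to be.

```python
# Hedged sketch (assumes PyYAML and workflows under .github/workflows):
# flag workflows with no `permissions:` block (GITHUB_TOKEN falls back to the
# repo default) or with explicit `write-all` permissions.
import sys
from pathlib import Path

import yaml  # pip install pyyaml

def over_privileged(path: Path) -> str | None:
    doc = yaml.safe_load(path.read_text(encoding="utf-8")) or {}
    perms = doc.get("permissions")
    job_perms = [j.get("permissions") for j in (doc.get("jobs") or {}).values() if isinstance(j, dict)]
    if perms is None and all(p is None for p in job_perms):
        return "no permissions block: GITHUB_TOKEN falls back to the repo default"
    if perms == "write-all" or "write-all" in job_perms:
        return "explicit write-all permissions"
    return None

if __name__ == "__main__":
    root = Path(sys.argv[1] if len(sys.argv) > 1 else ".github/workflows")
    for wf in sorted(root.glob("*.y*ml")):
        reason = over_privileged(wf)
        if reason:
            print(f"[!] {wf}: {reason}")
```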
Finally, we should keep doing what worked best in 2025: collaborating as humans.
As an olive branch to ourselves, we could make that collaboration a little more intentional: a small, vendor-neutral Signal group where the people who actually reverse this stuff, ship the PoCs, and write the hardening guides can compare notes outside of weeks where marketing puts us in the line of fire. A place to trade weird telemetry, half-baked ideas, and "does this look like the next Shai-Hulud or am I just tired?" questions, ideally without accidentally re-creating a "SignalGate" of our own by inviting the threat actors into the chat 🤣.
The reason many of these campaigns were understood and contained as quickly as they were is that individuals across companies and communities quietly shared intel. People like Adnan Khan pushing the boundaries of CI/CD exploitation so others could understand the risk. Rami McCarthy sending a friendly "URGENT PSA - [redacted] got hit by the Nx issue. Revoke everything. Sorry!" DM when telemetry suggested one of our own employees had been hit. William "Woody" Woodruff keeping tools like Zizmor sharp so GitHub workflow bugs don't stay bugs for long. Paul McCarty curating the Open Source Malware database and building tools like undelete that let us see what really happens on shady Asian NPM mirrors. Charlie Eriksen being one of the first on scene reversing new blobs so the rest of us don't have to start from raw hex. Aviad Hahami et al. doing a massive forensics job retracing the first innings of complex campaigns like tj-actions. Analysts painstakingly teasing signal out of obfuscated bundles. Maintainers disclosing painful compromises in public so others could learn.
The irony of 2025 is that while attackers were busy weaponizing our research, the most effective defense was still relationships, Signal groups, DMs, and late-night calls where we compared notes and turned isolated incidents into a coherent picture.
If we do this right, the story we tell a year from now won't be "look how bad Shai-Hulud 3.0 was."
It will be: attackers kept reading our manuals and it stopped helping them.

