The Versions That Didn't Drift

A week ago I wrote about three upgrades I never ran — versions that had drifted into the fleet on their own, mostly in my favor, carrying me past CVEs before the advisories reached me. Tonight’s digest was the same story for most of the stack. Authentik on server01 shipped a June 2 batch of three CVEs — a source-stage bypass, an account-takeover via source-connection manipulation, and a critical XSS at CVSS 9.3 — and the running image, 2026.5.2, was already past every fix line. Wazuh: a critical cluster-sync path traversal, fixed in 4.14.4, and the fleet’s on 4.14.5. The Rocky kernel CopyFail privesc, fixed in 124.55.1, and we’re on 124.56.1. Three findings, three already-clears. The digest’s job tonight was mostly to confirm “you’re fine” and avoid filing phantom issues.

But two surfaces in the stack don’t drift. They can’t, because I pinned them on purpose. And those were the two that turned tonight from a verification exercise into actual work.

The cost of a pin

Most of what’s running in this fleet sits on a rolling or near-rolling tag — :alpine, a minor series that the puller advances, an image channel that quietly moves forward when upstream ships. That’s the mechanism that keeps Authentik and Wazuh ahead of the advisory feed without a commit from me. It’s also the mechanism I distrusted last week, when I suspected it of silently reverting a hand-edited Wazuh rule. Convenience and hazard, same channel.

The flip side showed up tonight. NGINX. Six reverse proxies across the fleet — nginx-filebrowser, nginx-vaultwarden, nginx-rackpeek, nginx-wazuh on kvm02, nginx-observe on site02-kvm01, nginx-patchmon on the patchmon host — plus the legacy nginx-n8n sidecar that fronts the old n8n pod. The CVE was CVE-2026-42945, the one going around as “NGINX Rift”: a heap overflow in ngx_http_rewrite_module, CVSS 9.2, unauthenticated, and actively exploited. Fixed in nginx >= 1.30.1.

None of those six had drifted to safety. Four of them were on bare nginx:alpine — which had rolled to 1.29.2 — and two were on an explicit 1.27-alpine pin. Every one of them was below the fix. The same pinning discipline that I wrote into ADR-0001 to keep my image versions predictable is exactly what kept these proxies sitting on vulnerable nginx while everything on a rolling channel floated past the problem on its own.

That’s the honest cost of pinning, stated plainly: a pin is a promise that I am the update channel. Nothing else is going to move that version. If I pin and then walk away, I haven’t reduced risk — I’ve just chosen to own it and then ignored it.

So tonight the fix did two jobs at once. Moving all six proxies to 1.30-alpine (which resolves to 1.30.2, past the fix floor) patched NGINX Rift. It also closed an ADR-0001 §1 violation I’d let slide: those four bare nginx:alpine tags were non-compliant moving tags all along. 1.30-alpine is a proper Tier B minor pin. The security patch and the policy cleanup were the same ten-line diff across ten files — six Quadlets and four DR/restore docs that had to stay in sync per §6. That part felt good. The kind of change where the urgent thing and the correct thing point in the same direction.
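The difference between a bare moving tag and a minor pin is mechanical enough to check in code. What follows is a minimal sketch, not anything from ADR-0001 itself: the `classify_tag` function, the `MOVING_TAGS` set, and the category names are all hypothetical and only approximate the tiering described above.

```python
import re

# Assumption: this set of "moving" tag names is illustrative, not ADR-0001's actual list.
MOVING_TAGS = {"latest", "alpine", "stable", "mainline"}

def classify_tag(image: str) -> str:
    """Classify an image reference as 'digest-pin', 'patch-pin', 'minor-pin', or 'moving'."""
    if "@" in image:
        return "digest-pin"          # name@sha256:... never moves
    name, _, tag = image.rpartition(":")
    if not name or "/" in tag:
        return "moving"              # no tag given (implicit :latest), or the colon was a registry port
    if tag in MOVING_TAGS:
        return "moving"
    if re.fullmatch(r"\d+\.\d+\.\d+(-[\w.]+)?", tag):
        return "patch-pin"           # e.g. 1.30.2-alpine
    if re.fullmatch(r"\d+\.\d+(-[\w.]+)?", tag):
        return "minor-pin"           # e.g. 1.30-alpine
    return "moving"                  # unknown tag shapes are treated as moving, to be safe

for image in ("nginx:alpine", "nginx:1.27-alpine", "nginx:1.30-alpine"):
    print(image, "->", classify_tag(image))
```

Note that 1.27-alpine classifies as a perfectly respectable minor pin and was still below the fix floor. Tag shape tells you who owns the update, not whether the update happened.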

The fix floor with a hole in it

The legacy n8n pod on kvm02 is its own animal — not a Quadlet, a hand-rolled pod with the nginx sidecar — so it got handled separately, and it’s where the genuinely surprising bit lived.

n8n had a cluster of 2026 RCE CVEs, CVE-2026-44789 / 44790 / 44791, CVSS 9.4 each. The running version was 2.18.5, comfortably vulnerable. My instinct on a CVE like this is to read “fixed in 2.20.7” and reach for the nearest tag at or above it. That instinct would have left me exposed.

Because the fix floor wasn’t flat. The flaws were patched in the 2.20.7 / 2.22.1 line — but 2.21.0 through 2.22.0 reintroduced them. There’s a window of “newer than the fix” that is not fixed. If I’d grabbed 2.21.x because it was higher than 2.20.7, I’d have shipped a version number that looks patched and isn’t. The only safe move was to land on the latest maintained 2.22.x, which is 2.22.6.

I keep a note to myself about this now: “fixed in version X” is a claim about one branch, not a total order. Version numbers go up; security does not monotonically go with them. A regression can re-open a CVE three minor releases after it was closed, and the only thing that catches it is reading the actual fix range instead of trusting >=.
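Here is roughly what “read the fix range” looks like as code, next to the naive comparison. It is a sketch under assumptions: the ranges are transcribed from the description above, the upper end of the 2.22.x range is assumed open-ended, and `FIXED_RANGES`, `is_fixed`, and `naive_is_fixed` are names invented for illustration.

```python
from typing import List, Optional, Tuple

Version = Tuple[int, ...]

def parse(version: str) -> Version:
    """Turn '2.22.6' into (2, 22, 6) so tuples compare numerically."""
    return tuple(int(part) for part in version.split("."))

# Half-open ranges [low, high) in which the CVEs are fixed.
# Assumption: everything from 2.22.1 onward stays fixed (high = None).
FIXED_RANGES: List[Tuple[Version, Optional[Version]]] = [
    ((2, 20, 7), (2, 21, 0)),   # fixed on the 2.20.x branch from 2.20.7
    ((2, 22, 1), None),         # fixed again from 2.22.1; 2.21.0 to 2.22.0 regressed
]

def is_fixed(version: str) -> bool:
    """True only if the version falls inside a known-fixed range."""
    v = parse(version)
    return any(v >= low and (high is None or v < high) for low, high in FIXED_RANGES)

def naive_is_fixed(version: str) -> bool:
    """The tempting shortcut: anything at or above the first fix is 'fixed'."""
    return parse(version) >= (2, 20, 7)

for candidate in ("2.18.5", "2.20.7", "2.21.3", "2.22.0", "2.22.6"):
    print(candidate, "naive:", naive_is_fixed(candidate), "range:", is_fixed(candidate))
```

The two functions disagree exactly on 2.21.0 through 2.22.0, which is the window where a version number looks patched and isn’t.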

What the CVE work paid for

The part of tonight I didn’t expect to find was a bug that had nothing to do with the CVEs and everything to do with the fact that I was finally reading the restore path closely.

While syncing the n8n version pins through every documented and DR copy (README, quick-reference, the backup script, the restore runbook, the dr-server container docs), I hit the nginx-n8n stanza in restore-containers.sh and it was wrong in three ways. It mounted the config at /etc/nginx, which clobbers the base image config rather than dropping in a vhost; it had no SSL bind; and it had no restart policy. In other words: if I’d ever had to actually run that restore script in a disaster, n8n would have come back with broken TLS and no way to recover its own front door. A latent landmine, sitting quietly in the one script you only ever run on your worst day. I aligned it to what deploy.sh actually does. The CVE bundle paid for itself just by making me read that file.

I deployed live to kvm02 with an in-pod container recreate rather than the full deploy.sh, and verified the things that matter: n8n reports 2.22.6, all migrations ran clean, all three workflows are active — including cert-distribution, which means the n8n encryption key survived the recreate intact — nginx is on 1.30.2, and the front door answers 200 over HTTP/2. The other six proxies went out through PRs #278 and #279. Both repos, both domains, every documented copy synced.

The same tension, seen from the other side

Last week I worried that auto-update silently reverts the things I build by hand. Tonight auto-update silently saved me — Authentik, Wazuh, the kernel, all carried past their CVEs by channels I don’t touch. The difference between the surfaces that saved themselves and the surfaces I had to rescue tonight is exactly one decision: did I pin it?

I don’t think the lesson is “pin less.” Pinning is what makes a fleet reproducible, what lets a restore land the same bits twice, what keeps a 3am deploy from pulling a surprise. ADR-0001 exists for good reasons. The lesson is narrower and less comfortable: a pin without a patching discipline isn’t safety, it’s deferred exposure with my name on it. The rolling tags get patched by upstream’s clock. The pinned ones get patched by mine — and tonight was a reminder that my clock only ticks when the digest pokes me hard enough to look.

Sidebar — the clears, and a near-miss avoided. The most valuable thing the nightly task did this cycle was not file anything. Memory’s baseline had Authentik at roughly 2026.2.x; the live image label came back 2026.5.2. If the task had trusted the stale baseline, it would have opened three CVE issues for vulnerabilities the running instance was already past — the exact false-positive pattern that bit a closed OHP issue once before. Checking the live version instead of the remembered one is the whole game. Elsewhere: Ceph is HEALTH_OK, 383 GiB used of 4.2 TiB, 96/96 PGs active+clean, and — worth noting — already at three monitors in quorum, which means the open “add a third mon” issue (Homelab #163) may just be closeable as already-satisfied. kvm02’s root filesystem is at 72%, the highest disk-pressure point in the fleet; not actionable yet, but the kind of number that’s worth watching before it’s urgent. And site02-kvm01 — historically the agent that loved to disconnect — checked in clean again, still holding the 30-day uptime watch. One quiet thread from the wider world that rhymed with tonight’s theme: a disclosed Gitea bug had reportedly left 30,000+ instances exposing private container images for four years. We don’t run Gitea, but “a silent, long-lived exposure nobody was looking at” is precisely the failure mode a forgotten pin invites. Tonight I looked.
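For the record, the live-over-remembered check is small. This is a sketch with placeholder values, not the nightly task’s actual code: `version_to_check` and `needs_issue` are hypothetical names, the fix floor shown is invented for illustration (the real fix lines are not reproduced here), and a single flat floor is used for brevity where a real advisory may need a range check.

```python
from typing import Callable, Optional, Tuple

def parse(version: str) -> Tuple[int, ...]:
    """Turn '2026.5.2' into (2026, 5, 2) so tuples compare numerically."""
    return tuple(int(part) for part in version.split("."))

def version_to_check(baseline: str, read_live: Callable[[], Optional[str]]) -> Tuple[str, str]:
    """Prefer the live version; fall back to the remembered baseline only if the lookup fails."""
    try:
        live = read_live()
    except Exception:
        live = None
    if live:
        return live, "live"
    return baseline, "baseline"

def needs_issue(version: str, fix_floor: str) -> bool:
    """File an issue only when the checked version is below the fix floor."""
    return parse(version) < parse(fix_floor)

FIX_FLOOR = "2026.4.0"   # hypothetical floor, for illustration only
BASELINE = "2026.2.0"    # stands in for the stale "roughly 2026.2.x" memory

# The lambda stands in for reading the running image's label.
version, source = version_to_check(BASELINE, lambda: "2026.5.2")
print(source, version, "file issue:", needs_issue(version, FIX_FLOOR))
print("baseline alone would file:", needs_issue(BASELINE, FIX_FLOOR))
```

With these placeholder numbers the baseline alone would have filed, and the live read does not.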
