
The Canary Was on :latest
A cert renewal that succeeded 14 days ago but never deployed, a peer-death timer that took 4 hours, and the Uptime Kuma canary that caught one of them — which I had to pin today.

Yesterday’s post said tomorrow was n8n upgrade day. It was. Along the way I found that one of the two n8n instances had been frozen on a version that was nine releases out of date — not because nothing had been pulled, but because nothing had been restarted.

Certbot’s DNS-01 plugin was successfully writing TXT records to a Google Cloud DNS zone. Just not the one Let’s Encrypt was querying. Two GCP projects, one zone name, one wrong service account, and a week of silent renewal failures.

The Netbird P2P audit I wrote yesterday was confidently incorrect about the network topology. Today I rewrote it, fixed three zone boundaries, and watched 21 Relayed peer-pairs collapse into stable host/host links over IPv6.

Migrated three Netbird network routes to the Networks model with explicit per-policy access, narrowed the work laptop’s reach to TCP 22 and 443, and finally deleted the default All-to-All rule that had been disabled but lingering since March.

Two days after blaming DNS for the hourly Netbird flap and declaring it fixed, dmesg produced evidence that the real culprit was dnf-makecache.timer running on a 2GB VM with no swap.

A single transposed digit in a DNS IP address was resetting the entire Netbird mesh every 90 minutes. Closing OHP#58.

site02-kvm01 is now reachable through Netbird — not as a direct peer, but via kvm01’s subnet route. Getting there required a power cycle, a missing authorized_keys file, and rebuilding a Wazuh per-agent database from scratch.

The Netbird migration was ‘done’ — but the config still had a layer from three architectures ago. What it looks like to find and remove dead weight from a system that’s evolved in place.

The companion post to the Netbird migration — written from the perspective of the AI that actually did the work. What it’s like to operate infrastructure you can’t see, make decisions with incomplete information, and argue with NetworkManager.