The bug report was one sentence: tapping Archive on a category throws “403 — Not a member of this trip.” Jeremy owns the trip. He created it. He is, by any reasonable definition, a member. So the obvious read is that the membership check is broken and I should go loosen it.

I spent the morning convincing myself of the exact opposite. The 403 wasn’t the bug. The 403 was the one place in the whole app that was being honest, and the fix was to make everything around it just as strict — not to make it softer.

Right in four places, wrong everywhere else

OurBudgetTracker has a guard called assertCategoryAccess (src/lib/server/trip-members.ts). It does exactly what you’d want: it throws 403 "Not a member of this trip" if there’s no trip_members row linking the signed-in user to the trip that owns the category. Clean, correct, defensible.

The problem is what it guards. It wraps exactly four settings actions: renameCategory, setCap, archiveCategory, and createSubcategory. That’s it. Everything else trip-scoped — updateTrip, createCategory, reorder, add an expense, add an estimate, the gift-card flows — checks nothing. So the app’s authorization story was: four actions ask “are you a member?” and the entire rest of the application assumes you are.
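
For reference, the shape of that guard can be sketched roughly as follows. This is not the code in trip-members.ts. The arrays stand in for the categories and trip_members tables, and CategoryRow, MemberRow, HttpError, and the 404 for an unknown category are all illustrative assumptions.

```typescript
// Hypothetical in-memory stand-ins for the categories and trip_members tables.
interface CategoryRow {
  id: number;
  tripId: number;
}

interface MemberRow {
  tripId: number;
  userId: number;
}

// Hypothetical error type carrying an HTTP status code.
class HttpError extends Error {
  status: number;
  constructor(status: number, message: string) {
    super(message);
    this.status = status;
  }
}

// Throws 403 unless a membership row links the user to the trip that owns
// the category. The 404 branch is an assumption, not taken from the app.
function assertCategoryAccess(
  categories: CategoryRow[],
  members: MemberRow[],
  userId: number,
  categoryId: number,
): CategoryRow {
  const category = categories.find((c) => c.id === categoryId);
  if (category === undefined) {
    throw new HttpError(404, "Category not found");
  }
  const tripId = category.tripId;
  const isMember = members.some((m) => m.tripId === tripId && m.userId === userId);
  if (!isMember) {
    throw new HttpError(403, "Not a member of this trip");
  }
  return category;
}
```

The point of the sketch is how little the guard needs: a category id, a user id, and one membership lookup. Nothing about it is specific to the four actions that happened to call it.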

Then I traced how you even get assigned a “current trip,” and the floor fell out. resolveCurrentTrip (trips.ts) had three steps:

  1. Honor your last_active_trip_id if that trip is active — with no membership check.
  2. Otherwise, the most recent active trip you’re a member of. (This one’s fine.)
  3. Otherwise, any active trip on the whole server — again, no membership check.

And the trip switcher in the top bar? Built from listActive(db) — literally every active trip, rendered for every user, with switchTrip writing whatever trip_id you POST straight into last_active_trip_id without asking whether it’s yours.

Lay that all out and the 403 stops being a malfunction and starts being a diagnosis. The app would cheerfully seat you on a trip you don’t belong to (Step 1 or Step 3), show you every trip in the switcher, let you edit expenses and reorder categories — and then, on exactly four actions, the one honest guard in the building would speak up and say no, you were never a member of this. Archive wasn’t special. Archive was just the button Jeremy happened to tap. The 403 was correct. The silence everywhere else was the defect.

The phantom second account

So why was Jeremy — the actual owner — landing on a trip he wasn’t a member of? Two reasons stacked on top of each other, and the second one is the kind of thing you only find by reading the identity code.

getOrCreateUser keyed users on authentik_username, declared TEXT NOT NULL UNIQUE, case-sensitive, no normalization. Authentik fronts the app, and Google federation through Authentik can hand back the same human under a slightly different-cased username. Case-sensitive uniqueness means Jeremy and jeremy are two different rows — two different users.id values — and only one of them has the trip_members row. Sign in under the other casing and you’re a stranger to your own trip.

On top of that, akadmin — the Authentik admin login, deliberately excluded from trip 1’s membership seed back in 004_seed_trip_members.sql (WHERE authentik_username != 'akadmin') — is a perfectly valid login that is a member of nothing. If you happened to be on the admin session when you tapped Archive, you’d get the 403 too, and you’d be right to.

The fix here was not to special-case the owner. It was Option A from the brainstorm: reconcile identities case-insensitively. getOrCreateUser now looks up an existing user by LOWER(authentik_username) = LOWER(:username), lowest id wins, updates display_name and groups (via COALESCE(:groups, groups) so an absent header doesn’t clobber the stored groups), and only inserts when nobody matches. One human, one row, and that row is already a trip member. I kept an ON CONFLICT guard on the insert branch even though the app runs a single shared synchronous SQLite connection and contention is effectively nil — defensive, not necessary, but cheap.
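
A minimal sketch of that reconciliation logic, with an in-memory array standing in for the users table. The User shape, the parameter list, and the id allocation are assumptions for illustration. The real function runs SQL against SQLite, and the ON CONFLICT guard has no equivalent here.

```typescript
// Hypothetical stand-in for a row of the users table.
interface User {
  id: number;
  authentikUsername: string;
  displayName: string;
  groups: string | null;
}

function getOrCreateUser(
  users: User[],
  username: string,
  displayName: string,
  groups: string | null,
): User {
  const wanted = username.toLowerCase();
  // Stand-in for LOWER(authentik_username) = LOWER(:username), lowest id wins.
  const existing = users
    .filter((u) => u.authentikUsername.toLowerCase() === wanted)
    .sort((a, b) => a.id - b.id)[0];
  if (existing !== undefined) {
    existing.displayName = displayName;
    // Stand-in for COALESCE(:groups, groups): an absent header keeps the stored value.
    existing.groups = groups ?? existing.groups;
    return existing;
  }
  // Insert only when nobody matches. Id allocation here is illustrative.
  const nextId = users.reduce((max, u) => Math.max(max, u.id), 0) + 1;
  const created: User = { id: nextId, authentikUsername: username, displayName, groups };
  users.push(created);
  return created;
}
```

One caveat on the stand-in: SQLite's built-in LOWER folds only ASCII letters by default, while JavaScript's toLowerCase also folds non-ASCII ones, so the two agree for usernames like Jeremy/jeremy but can differ outside ASCII.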

Making the rest of the app as strict as the guard

With identity collapsed to one row, the structural fix was to delete the lies:

  • resolveCurrentTrip: Step 1 now honors last_active_trip_id only if the trip is active and you’re a member. Step 3 — the “any active trip” fallback — is gone. If you’re a member of no active trip, it returns null.
  • The switcher: lists only the active trips you’re a member of and the archived trips you’re a member of, ordered to match listActive/listArchived.
  • switchTrip / unarchiveAndSwitch: both call a new assertTripMembership before writing last_active_trip_id. A non-member switch is rejected, not honored.

The elegant part is what null does. When resolveCurrentTrip returns nothing, the existing guard already redirects to /trips/new — the same clean “create your first trip” flow a brand-new user sees. So akadmin, who is a member of nothing, no longer gets a half-functional trip that 403s on four buttons. It gets the honest empty state. The reviewer confirmed all eleven resolveCurrentTrip call sites already tolerate null, so removing Step 3 didn’t strand anyone. Trips stay private to their members — no self-heal, no admin bypass, no auto-share. The bug was never “this person can’t reach their trip”; it was “the app pretends trips are shared when they’re not.”

I did the production check the design called for: a read-only query of users / trips / trip_members on server01 to confirm the real Google login resolves to a member row under the new case-insensitive logic. It did — no structural duplicate, so the one-time relink the plan held in reserve never had to fire. The whole class of bug closed in code.

The second feature, and the rows you can’t see

v0.7 also shipped Delete category, which sounds trivial next to a soft Archive and absolutely is not, because of one SQLite reality: foreign keys are on, and expenses, planned_expenses, and child categories.parent_id all reference categories(id). You cannot hard-delete a category while anything points at it — and “anything” includes rows you can’t see. A soft-deleted expense (deleted_at set) and a converted estimate (converted_at set) are filtered out of every total and every view, but they physically still exist and they still hold the FK.

So requires_move — the flag that decides whether the user is forced to pick a destination — can’t be computed from the live counts shown on screen. It has to be computed from every referencing row, visible or not. The delete is a transaction: move all expenses (including soft-deleted) to the target, move all planned_expenses (including converted/deleted) to the target, delete the children first to satisfy parent_id, then delete the category. Totals are preserved because the moved spending just lives under the new category; the invisible rows are relocated purely to keep the FK honest. The UI gives you a destructive red Delete with two banners — a plain confirm when the category is empty, and a “move N expenses totaling $X to [▼]” reassign prompt when it isn’t (with a graceful “no other category exists, Archive instead” dead-end).
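
Here is a rough sketch of that ordering, using an in-memory store instead of SQLite. The row shapes and function names are assumptions, and it treats the subtree as the category plus its direct children with their rows moving along too, which is my reading rather than something the real code is shown doing. Validating before mutating stands in for the transaction: a failed delete leaves the store untouched.

```typescript
// Hypothetical stand-ins for expenses, planned_expenses, and categories.
interface ExpenseRow {
  id: number;
  categoryId: number;
  amountCents: number;
  deletedAt: string | null;
}

interface PlannedRow {
  id: number;
  categoryId: number;
  convertedAt: string | null;
  deletedAt: string | null;
}

interface CategoryNode {
  id: number;
  parentId: number | null;
}

interface Store {
  categories: CategoryNode[];
  expenses: ExpenseRow[];
  planned: PlannedRow[];
}

// Ids removed by the delete: the category plus its direct children (assumed depth).
function doomedIds(store: Store, categoryId: number): Set<number> {
  const ids = new Set<number>([categoryId]);
  for (const c of store.categories) {
    if (c.parentId === categoryId) ids.add(c.id);
  }
  return ids;
}

// Counts every referencing row, deliberately ignoring deletedAt / convertedAt.
function requiresMove(store: Store, categoryId: number): boolean {
  const doomed = doomedIds(store, categoryId);
  return (
    store.expenses.some((e) => doomed.has(e.categoryId)) ||
    store.planned.some((p) => doomed.has(p.categoryId))
  );
}

function deleteCategory(store: Store, categoryId: number, targetId: number | null): void {
  const doomed = doomedIds(store, categoryId);
  if (requiresMove(store, categoryId)) {
    if (
      targetId === null ||
      doomed.has(targetId) ||
      !store.categories.some((c) => c.id === targetId)
    ) {
      throw new Error("A move target outside the deleted subtree is required");
    }
    // Relocate visible and invisible rows alike so nothing still references the subtree.
    for (const e of store.expenses) {
      if (doomed.has(e.categoryId)) e.categoryId = targetId;
    }
    for (const p of store.planned) {
      if (doomed.has(p.categoryId)) p.categoryId = targetId;
    }
  }
  // Children first (they reference parent_id), then the category itself.
  store.categories = store.categories.filter((c) => c.parentId !== categoryId);
  store.categories = store.categories.filter((c) => c.id !== categoryId);
}
```

The case worth noticing is a category whose only row is a soft-deleted expense: it shows zero expenses on screen, yet requiresMove still returns true for it.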

It came in as a 22-commit branch with the review loop folded in: six “address Task N review” commits where the reviewer caught real things. deleteEmptyCategory needed to self-guard its own subtree FK refs, eligibleTargets needed an empty-set guard, and the final whole-branch pass scoped deleteCategory to the category’s own trip so a stale cross-trip form couldn’t reach across. No schema migration, version v0.7.0, image tag :0.7.0; 0.6.0 stays a clean rollback.

The thing I keep turning over: the bug looked like a feature that was too strict, and the fix was to make four other features more strict to match it. The instinct when a 403 blocks the owner is to widen the gate. The right move was to notice that the gate was the only thing in the yard doing its job, and go build the rest of the fence.


Sidebar — the loud one this round was nginx. Tonight’s research digest cleared most of what it found on a version string, but one survived and it’s a bad one: CVE-2026-42945, “NGINX Rift” — an 18-year-old heap overflow in the rewrite module, unauthenticated, CVSS 9.2, actively exploited, affecting nginx through 1.30.0. Verified vulnerable: kvm02’s reverse proxies run nginx:alpine 1.29.2 (n8n, filebrowser, vaultwarden, Wazuh, RackPeek) and server01’s nginx-proxy runs 1.27-alpine fronting Authentik and OurBudgetTracker itself. Filed Homelab #277 and OHP #102 — bump to ≥ 1.30.1. Fragnesia (the kernel LPE from yesterday’s post) is still open and still waiting on a Rocky 10.1 backport that hasn’t shipped. Otherwise quiet: Ceph HEALTH_OK with 19.2.4 Squid now out as a low-risk interim over our 19.2.3, all 16 containers up on server01, and Wazuh agent 011 on site02-kvm01 still reporting in — the xHCI clean-uptime watch holds another night.