diff --git a/tools/llm-sequential-upgrade/sequential-upgrade/STDB_COST_TRACKING.local.md b/tools/llm-sequential-upgrade/sequential-upgrade/STDB_COST_TRACKING.local.md
deleted file mode 100644
index 1c3db1d027..0000000000
--- a/tools/llm-sequential-upgrade/sequential-upgrade/STDB_COST_TRACKING.local.md
+++ /dev/null
@@ -1,100 +0,0 @@
-# SpacetimeDB Cost Tracking (per level)
-
-Local working doc. Costs are from each run's `cost-summary.json` (`totalCostUsd`,
-from Claude Code's `cost_usd` OTel attribute). All runs are **claude-sonnet-4-6**
-(published April runs confirmed sonnet via cost/token ratio; current run confirmed
-via raw telemetry).
-
-## Per-level totals (generate/upgrade + any fix iterations)
-
-| Level | Pub 20260403 | Pub 20260406 | Current 20260617-2 (post-fix) |
-|-------|-------------|-------------|-------------------------------|
-| L1    | $2.8412 (gen 1.3705 + 4 fix 1.4706) | $1.6457 (gen 0.8384 + 1 fix 0.8073) | **$1.6680** (0 fix) |
-| L2    | $1.1751 | $1.1816 | **$1.0361** |
-| L3    | $1.2901 | $0.9898 | **$1.2793** |
-| L4    | $0.9492 | $0.5896 | **$0.8456** |
-| L5    | $0.0574 ⚠ | $0.9215 | **$1.2242** |
-| L6    | $1.1623 | $1.5011 | **$1.1723** |
-| L7    | $0.7995 | $1.2543 | **$2.1762** (upg 1.69 + fix 0.48) |
-| L8    | $0.8281 | $0.9762 | **$1.6049** |
-| L9    | $1.7680 | $1.3869 | **$1.4577** |
-| L10   | $0.5679 | $0.8211 | **$0.6776** |
-| L11   | $1.0007 | $0.4019 | **$1.0669** |
-| L12   | $0.8855 | $0.9498 | **$1.4402** |
-| **Total** | **$13.3250** | **$12.6195** | **$15.6490** (L1–L12 COMPLETE) |
-
-## Cumulative cost-to-done (the "as we go" comparison)
-
-Running total through each level (includes fix iterations = true cost-to-done).
-
-| Through | Current 20260617-2 | Pub 20260403 | Pub 20260406 | Current vs cheaper pub |
-|---------|--------------------|--------------|--------------|------------------------|
-| L1      | $1.67              | $2.84        | $1.65        | +$0.02 |
-| L2      | $2.70              | $4.02        | $2.83        | −$0.13 |
-| L3      | $3.98              | $5.31        | $3.82        | +$0.16 |
-| L4      | $4.83              | $6.26        | $4.41        | +$0.42 |
-| L5      | $6.05              | $6.31        | $5.33        | +$0.72 |
-| L6      | $7.23              | $7.48        | $6.83        | +$0.40 |
-| L7      | $9.40              | $8.27        | $8.08        | +$1.32 |
-| L8      | $11.01             | $9.10        | $9.06        | +$1.95 |
-| L9      | $12.46             | $10.87       | $10.45       | +$2.02 |
-| L10     | $13.14             | $11.44       | $11.27       | +$1.87 |
-| L11     | $14.21             | $12.44       | $11.67       | +$2.54 |
-| L12     | $15.65             | $13.33       | $12.62       | +$3.03 |
-
-_L7 includes a presence fix (multi-connection offline bug) — the **same** bug mongo was
-dinged for. The published April runs did NOT fix it (no fix-level7 in their data; same code
-pattern), so our higher L7 reflects holding STDB to the 100% bar they weren't held to._
-
-Through L2 the current run was **level with or below both published runs on cost-to-done**, because it needed none of their L1 fix
-cycles. It passes the cheaper published run (20260406) at L3 and both from L7 on, ending at $15.65 vs $13.33 and $12.62.
-
-## vs Mongo — the true apples-to-apples (both fixed to 100%)
-
-Mongo (Express + Mongoose + Socket.io, run 20260616) was also graded to 100% with
-exhaustive fixes — the fairest comparison to the current STDB run. The published April
-STDB runs were NOT held to that bar.
-
-| Lv | feature | STDB current | Mongo |
-|----|---------|--------------|-------|
-| L1 | basic | $1.67 | $1.14 |
-| L2 | scheduled | $1.04 | $0.78 |
-| L3 | ephemeral | $1.28 | $0.77 |
-| L4 | reactions | $0.85 | $0.58 |
-| L5 | editing | $1.22 | $0.57 |
-| L6 | permissions | $1.17 | $0.90 |
-| L7 | presence | $2.18 | $2.79 |
-| L8 | threading | $1.60 | $1.76 |
-| L9 | private/DM | $1.46 | $1.45 |
-| L10 | activity | $0.68 | $1.15 |
-| L11 | drafts | $1.07 | $1.04 |
-| L12 | anon | $1.44 | $0.98 |
-| **Total** | | **$15.65** (1 fix) | **$13.92** (4 fixes) |
-
-- STDB +$1.73 (+12%) on cost, but **1 fix vs mongo's 4** (mongo: L1, L7, L8, L10).
-- On the bug-magnet levels STDB was **cheaper to-done**: L7 ($2.18 vs $2.79; mongo's fix
-  alone was $1.59/3 bugs), L8 ($1.60 vs $1.76), L10 ($0.68 vs $1.15).
-- STDB's premium is concentrated in heavy-but-clean features (L1, L5, L12) = the SDK-2.6
-  output cost, not debugging.
-
-## Setup differences (important for fair comparison)
-
-- **Published April runs (20260403, 20260406):** `rules: standard`, SpacetimeDB ~2.0.x SDK,
-  buggy templates (missing typescript devDep + `moduleResolution: node`). May NOT have been
-  fixed to the same exhaustive 100% bar (GRADING_RESULTS never published; sample data shows
-  some 2/3s) — so their totals are "cost to published scores," not necessarily "cost to 100%."
-- **Current run (20260617-2):** `rules: guided` + full official skills, SpacetimeDB 2.6.0,
-  **fixed templates**, graded to 100% (3/3 every feature), 1 fix iteration (L7 presence).
-
-## Notes
-
-- The "$0.84 published L1" figure cited in earlier notes was a **single session** — the
-  20260406 L1 *generate* ($0.8384), NOT the L1 cost-to-done. Both published L1s actually
-  cost ~$1.6–2.8 once their fix sessions are included.
-- Published 20260403 L5 = $0.0574 / 2 calls — a near-no-op (likely broken/trivial upgrade).
-- Published L1 "fixes" were scattered across the run timeline (some after later levels were
-  built), so summing them as "L1 cost-to-done" overstates L1; the clean comparison is
-  generate-to-generate.
-- Clean generate/upgrade comparison at matching levels: current is in line with published
-  per-level (e.g. L2 current $1.04 < both published $1.18); the SDK-2.6 premium shows up
-  mainly on the L1 generate.