SpacetimeDB

mirror of https://github.com/clockworklabs/SpacetimeDB.git synced 2026-05-06 15:49:35 -04:00

Author	SHA1	Message	Date
John Detter	280818631d	client-api: stop logging suspended-database errors as 500s (#4918 ) # Description of Changes Disclaimer: This description was written by claude: The two `.leader()` call sites in [`client-api/src/routes/database.rs`](crates/client-api/src/routes/database.rs) and [`client-api/src/routes/subscribe.rs`](crates/client-api/src/routes/subscribe.rs) pipe `GetLeaderHostError` through `log_and_500`, which: 1. Logs every error variant at `error` level — including `Suspended`, `Bootstrapping`, `NoLeader`, etc., which are normal operational states. This produces noisy log lines like: ``` ERROR /app/.../client-api/src/lib.rs:623: internal error: database is suspended ``` 2. Forces every such error into a 500 Internal Server Error response, even when the appropriate status code is something else (e.g. 503 Service Unavailable for a suspended database). `GetLeaderHostError` already implements `Into<axum::response::ErrorResponse>` with the correct per-variant mapping: \| Variant \| Status \| \|---\|---\| \| `NoSuchDatabase` \| 404 Not Found \| \| `LaunchError`, `Misdirected` \| 500 Internal Server Error \| \| `NoNodeId`, `NoLeader`, `ControlConnection`, `Suspended`, `Bootstrapping` \| 503 Service Unavailable \| The standalone implementation already uses `?`-propagation directly. This PR makes the two client-api call sites match that pattern. ## Result - Suspended / bootstrapping / no-leader databases now return 503 instead of 500. - These expected states no longer produce `error`-level log spam in the request path. Genuinely unexpected internal errors elsewhere in the codebase continue to log via `log_and_500` unchanged. # API and ABI breaking changes None # Expected complexity level and risk 1 # Testing - [ ] Deploy to staging to see if we still see this error when trying to access a suspended database	2026-04-30 16:24:02 +00:00
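The per-variant mapping described in the entry above can be sketched as a plain function. This is only an illustration: `LeaderError` is a hypothetical stand-in for `GetLeaderHostError` (variant names come from the commit message, payloads are omitted), and it returns a bare `u16` rather than going through `Into<axum::response::ErrorResponse>` as the real type does. `is_unexpected` is likewise an assumed helper, not something in the codebase.

```rust
/// Hypothetical stand-in for `GetLeaderHostError`; variant names are taken
/// from the commit message, payloads are omitted.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum LeaderError {
    NoSuchDatabase,
    LaunchError,
    Misdirected,
    NoNodeId,
    NoLeader,
    ControlConnection,
    Suspended,
    Bootstrapping,
}

/// HTTP status for each variant, following the table in the commit message.
pub fn status_code(err: LeaderError) -> u16 {
    match err {
        LeaderError::NoSuchDatabase => 404,
        LeaderError::LaunchError | LeaderError::Misdirected => 500,
        LeaderError::NoNodeId
        | LeaderError::NoLeader
        | LeaderError::ControlConnection
        | LeaderError::Suspended
        | LeaderError::Bootstrapping => 503,
    }
}

/// Assumed helper: only the 500-class variants are treated as unexpected,
/// so only they would warrant an `error`-level log line.
pub fn is_unexpected(err: LeaderError) -> bool {
    status_code(err) == 500
}
```

Under this sketch, a suspended database maps to 503 and is not treated as an internal error, which is the behavior the PR aims for.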
Kim Altintop	e3060d2602	Confirmed database updates, take 3 (#4909 ) Re-opens #4846 (#4889) again, due to new cross-repo checks	2026-04-29 11:22:08 +00:00
Zeke Foppa	70db721c3a	Revert breaking PRs (#4881 ) # Description of Changes Revert the following PRs that have caused some breakage: ``` `a32cffa76` Finish refactoring out replay (#4850) `d639be0af` Replay: some code motion & reuse `ReplayCommittedState` (#4849) `78d6b6f7d` Update NativeAOT-LLVM infrastructure to current ABI (#4515) `d5c1738c1` Better module backtraces for panics and whatnot (#577) `6f23b19f3` Wait for database update to become durable (#4846) `81c9eab86` Add `spacetime lock/unlock` to prevent accidental database deletion (#4502) `809aebd7c` Move field `replay_table_updated` to `ReplayCommittedState` (#4807) `21b58ef99` Update axum (#2713) `b5cadff7a` Extract replay stuff out of `CommittedState`, part 1 (#4804) ``` I also updated the Python smoketests for breakage introduced in https://github.com/clockworklabs/SpacetimeDB/pull/4502. Reverting that PR caused conflicts, so this fix is more straightforward. # API and ABI breaking changes Maybe kind of, but we haven't released any of these. # Expected complexity level and risk 1 # Testing Ask @bfops about testing --------- Co-authored-by: Zeke Foppa <bfops@users.noreply.github.com>	2026-04-23 14:54:23 -07:00
Kim Altintop	6f23b19f36	Wait for database update to become durable (#4846) Confirmed reads apply only to subscription clients; calls to the HTTP API publish endpoint return a success response before the operation is confirmed. While we await scheduling of a new database, updates need to wait for the update transaction to be confirmed. To allow this, the `TransactionOffset` channel and the database's `DurableOffset` need to be returned all the way up to the request handler. Note that waiting for confirmation is almost always the right choice, so it can't be opted out of at the time of submission of this patch. Callers may, however, extend the timeout after which waiting for confirmation is cancelled.	2026-04-21 07:58:57 +00:00
clockwork-labs-bot	81c9eab86c	Add `spacetime lock/unlock` to prevent accidental database deletion (#4502 ) ## Motivation Feature request: "Is there any way we can lock a module to prevent it from being deleted? A bit concerned about some fat finger risk of accidentally deleting prod." ## Solution Adds a database lock mechanism. A locked database cannot be deleted until explicitly unlocked. ### New CLI Commands ```bash # Lock a database to prevent deletion spacetime lock my-database # Attempt to delete a locked database (fails with 403) spacetime delete my-database # Error: Database is locked and cannot be deleted. Run \`spacetime unlock\` first. # Unlock when you actually need to delete spacetime unlock my-database spacetime delete my-database ``` Both commands support `--server` and `--no-config` flags, and resolve the database from `spacetime.json` when no argument is given (same as `spacetime delete`). ### New HTTP API - `POST /v1/database/:name_or_identity/lock` -- Lock a database - `POST /v1/database/:name_or_identity/unlock` -- Unlock a database Both require the same authorization as `DELETE` (owner only). ### Implementation - Lock state stored in a separate `database_locks` sled tree in the standalone control DB (avoids changing the `Database` struct and needing a data migration) - `ControlStateReadAccess::is_database_locked()` and `ControlStateWriteAccess::set_database_lock()` added to the trait - `delete_database` route checks lock state before proceeding; returns `403 Forbidden` with a descriptive message if locked - Locking is idempotent (locking an already-locked database is a no-op, same for unlock) - Lock only prevents deletion, not publishing updates ### What is NOT locked - `spacetime publish` (updating module code) still works on locked databases - Only `spacetime delete` is blocked This matches the intent: protect prod from accidental destruction while allowing normal deployments. 
--------- Co-authored-by: clockwork-labs-bot <clockwork-labs-bot@users.noreply.github.com>	2026-04-20 21:03:15 +00:00
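The lock semantics listed in the entry above (idempotent lock/unlock, deletion refused with 403 while locked) can be sketched with an in-memory model. Everything here is hypothetical: `ControlDb` and its methods are illustration names, the real lock state lives in a `database_locks` sled tree, and the 404 for a missing database is an assumption of this sketch rather than something the commit states.

```rust
use std::collections::HashSet;

/// Hypothetical in-memory model of the control DB; the real implementation
/// stores lock state in a separate `database_locks` sled tree.
#[derive(Default)]
pub struct ControlDb {
    databases: HashSet<String>,
    locks: HashSet<String>,
}

impl ControlDb {
    pub fn create(&mut self, name: &str) {
        self.databases.insert(name.to_owned());
    }

    /// Idempotent: locking a locked database or unlocking an unlocked one
    /// leaves the state unchanged.
    pub fn set_lock(&mut self, name: &str, locked: bool) {
        if locked {
            self.locks.insert(name.to_owned());
        } else {
            self.locks.remove(name);
        }
    }

    pub fn is_locked(&self, name: &str) -> bool {
        self.locks.contains(name)
    }

    /// Returns `Err(403)` if the database is locked. `Err(404)` for an
    /// unknown database is an assumption of this sketch.
    pub fn delete(&mut self, name: &str) -> Result<(), u16> {
        if self.is_locked(name) {
            return Err(403);
        }
        if self.databases.remove(name) {
            Ok(())
        } else {
            Err(404)
        }
    }
}
```

Keeping the lock set separate from the database set mirrors the stated design choice of not changing the `Database` struct.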
Noa	21b58ef993	Update axum (#2713) # Description of Changes ~~Axum now has what we need out of it for a websocket wrapper, so we no longer need to duplicate `util/flat_csv.rs` and `util/websocket.rs`, meaning we have less code to maintain.~~ Never mind, we now send raw frames, which axum's wrapper does not support. Makes this PR simpler. # Expected complexity level and risk 2 - upgrading a dependency is a risk, but looking through the [changelog](https://github.com/tokio-rs/axum/blob/main/axum/CHANGELOG.md#080) there isn't anything that should affect us. # Testing - [x] tests pass Co-authored-by: Tyler Cloutier <cloutiertyler@users.noreply.github.com>	2026-04-16 13:08:25 +00:00
clockwork-labs-bot	f672ae2273	Run client_connected hook for HTTP SQL requests (#4563 ) ## Summary The `/v1/database/:name_or_identity/sql` endpoint now calls the module's `client_connected` reducer before executing SQL, and `client_disconnected` after. This allows module authors to accept or reject SQL connections based on the caller's identity, matching the existing behavior of the `/call` endpoint. ## Motivation Previously, the `/sql` endpoint bypassed the module's `onConnect` hook entirely, meaning module authors had no way to restrict who could run SQL queries against their database. The `/call` endpoint already runs the connect hook, so this brings `/sql` to parity. ## Changes - `crates/client-api/src/routes/database.rs`: The `sql` handler now: 1. Generates a random connection ID 2. Calls `module.call_identity_connected()` before executing SQL 3. Executes the SQL query 4. Calls `module.call_identity_disconnected()` after 5. If `client_connected` rejects the connection, returns 403 Forbidden without executing the query - `sql_direct()` is unchanged since it is also used by the pgwire server, which has its own connection lifecycle. ## Behavior - If the module defines a `client_connected` reducer that throws/errors for a given identity, the SQL request returns `403 Forbidden` - If no `client_connected` reducer is defined, behavior is unchanged - The connection is always cleaned up via `client_disconnected` after the query completes --------- Co-authored-by: clockwork-labs-bot <clockwork-labs-bot@users.noreply.github.com> Co-authored-by: Zeke Foppa <bfops@users.noreply.github.com> Co-authored-by: Zeke Foppa <196249+bfops@users.noreply.github.com>	2026-04-15 08:56:43 -07:00
joshua-spacetime	7b3bc01d68	v3 websocket transport protocol (#4761 ) # Description of Changes The `v3` WebSocket API adds a thin transport layer around the existing `v2` message schema so that multiple logical `ClientMessage`s can be sent in a single WebSocket frame. The motivation is throughput. In `v2`, each logical client message requires its own WebSocket frame, which adds per-frame overhead in the client runtime, server framing/compression path, and network stack. High-throughput clients naturally issue bursts of requests, and batching those requests into a single frame materially reduces that overhead while preserving the existing logical message model. `v3` keeps the `v2` message schema intact and treats batching as a transport concern rather than a semantic protocol change. This lets the server support both protocols cleanly: - `v2` remains unchanged for existing clients - `v3` allows new clients to batch logical messages without changing reducer/procedure semantics - inner messages are still processed in order On the server side, this PR adds: - `v3.bsatn.spacetimedb` protocol support - `ClientFrame` / `ServerFrame` transport envelopes - decoding of inbound batched client frames into ordered `v2` logical messages - v3 outbound framing on the server side # API and ABI breaking changes None. `v2` clients continue to work unchanged. # Expected complexity level and risk 2 # Testing Testing will be included in the patches that update the sdk bindings	2026-04-14 19:54:22 +00:00
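The batching idea in the entry above, several logical messages carried in one frame and processed in order, can be illustrated with a toy envelope. This is not the `v3.bsatn.spacetimedb` wire format: the real `ClientFrame` / `ServerFrame` envelopes are BSATN-encoded, whereas this sketch assumes a made-up layout of a little-endian `u32` message count followed by length-prefixed payloads. `encode_frame` and `decode_frame` are hypothetical names.

```rust
/// Toy envelope (NOT the real v3 format): u32 LE count, then for each
/// message a u32 LE length followed by the message bytes.
pub fn encode_frame(messages: &[Vec<u8>]) -> Vec<u8> {
    let mut out = Vec::new();
    out.extend_from_slice(&(messages.len() as u32).to_le_bytes());
    for m in messages {
        out.extend_from_slice(&(m.len() as u32).to_le_bytes());
        out.extend_from_slice(m);
    }
    out
}

/// Reads a little-endian u32 at `*pos`, advancing `*pos` on success.
fn read_u32(buf: &[u8], pos: &mut usize) -> Option<u32> {
    let end = pos.checked_add(4)?;
    let b = buf.get(*pos..end)?;
    *pos = end;
    Some(u32::from_le_bytes([b[0], b[1], b[2], b[3]]))
}

/// Decodes one frame into its inner messages, preserving their order.
/// Returns `None` on truncated input or trailing bytes.
pub fn decode_frame(buf: &[u8]) -> Option<Vec<Vec<u8>>> {
    let mut pos = 0;
    let count = read_u32(buf, &mut pos)?;
    let mut messages = Vec::new();
    for _ in 0..count {
        let len = read_u32(buf, &mut pos)? as usize;
        let end = pos.checked_add(len)?;
        messages.push(buf.get(pos..end)?.to_vec());
        pos = end;
    }
    if pos == buf.len() {
        Some(messages)
    } else {
        None
    }
}
```

The point of the sketch is only that the inner payloads stay opaque and ordered, so batching remains a transport concern and the logical message schema is untouched.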
Shiven Garia	44ef7f4cd3	fix: Replace unwrap with proper error handling in WebSocket subscribe handler (#4696 ) # Description of Changes Replace `.unwrap()` with `.map_err(log_and_500)?` on the `get_database_by_identity()` call in the WebSocket subscribe handler (`crates/client-api/src/routes/subscribe.rs`). This was the only call site in the codebase using `.unwrap()` for this method — all four other call sites in `database.rs` already use `.map_err(log_and_500)?`. A transient database error during WebSocket connection setup would panic the server instead of returning an HTTP 500. Closes #4686 # API and ABI breaking changes None. # Expected complexity level and risk 1 — Single-token replacement matching the established pattern used everywhere else. # Testing - [ ] Verified the fix matches the error-handling pattern used at all other `get_database_by_identity` call sites - [ ] Confirmed `log_and_500` is already imported in the file	2026-04-10 15:56:54 +00:00
joshua-spacetime	05a4a7ba83	Replace `JsInstance` pool with single worker and FIFO queue (#4663 ) # Description of Changes Before this change, JS reducer requests borrowed a `JsInstance` from a pool. If no idle instance was available, we created another instance, which meant another V8 worker thread. Under load, this meant reducers bouncing across multiple OS threads. After this change, JS reducers go through a single long-lived `JsInstance` fed by a FIFO queue which results in much better cache locality. More accurately, each module now allocates a single OS thread, on which reducers (and most operations) run. Modules do not share workers/threads. And modules do not create multiple threads for running reducers. Note, the original instance pool is still used for procedures. It should probably be bounded, but I didn't make any changes to it. It's also used for executing views during initial subscription to avoid a reentrancy deadlock. The latter should be fixed and moved over to the JS worker thread at some point. # API and ABI breaking changes N/A # Expected complexity level and risk 4 # Testing ``` NODE_OPTIONS="--max-old-space-size=8192" \ MAX_INFLIGHT_PER_WORKER=512 \ BENCH_PRECOMPUTED_TRANSFER_PAIRS=1000000 \ pnpm bench test-1 --seconds 10 --concurrency 50 --alpha 1.5 --connectors spacetimedb ``` ``` 50K TPS -> 85K TPS on m2 mac ```	2026-03-24 00:13:26 +00:00
joshua-spacetime	047dac9745	Remove legacy SQL code (#4628) # Description of Changes This patch removes the dead legacy SQL query engine and the remaining code that only existed to support it. Removed: - Old SQL compiler/type-checker and VM-based execution path in spacetimedb-core - `spacetimedb-vm` crate - Dead vm-specific error variants and compatibility code - Obsolete tests, benchmarks, and config paths that still referenced the legacy engine Small pieces still used by the current engine were moved to their proper homes instead of keeping the `vm` crate around. In particular, `RelValue` was moved to `spacetimedb-execution`. The `sqltest` crate was also updated to use the current engine. Note that these tests are not run in CI, but I've kept them around as they may be useful as we look to expand our SQL support in the future. Requires codeowner review from @cloutiertyler due to the removal of the `LICENSE` file in the (now removed) `vm` crate. # API and ABI breaking changes None # Expected complexity level and risk 1 # Testing None --------- Co-authored-by: clockwork-labs-bot <clockwork-labs-bot@users.noreply.github.com>	2026-03-19 17:47:00 +00:00
Shiven Garia	28cb32c284	fix: Replace unwrap with proper error handling in set_domains handler (#4643 ) Closes #4635 In `crates/client-api/src/routes/database.rs` line 1111, `lookup_database_identity` returns a `Result` but the error was handled with `.unwrap()`, which panics instead of returning an HTTP 500. Replaced it with `.map_err(log_and_500)?` to match the rest of the file. --------- Co-authored-by: Zeke Foppa <196249+bfops@users.noreply.github.com> Co-authored-by: Zeke Foppa <bfops@users.noreply.github.com>	2026-03-18 23:02:38 +00:00
Simon Gellis	18d4fbc320	Upgrade prometheus to 0.14.0 (#4598 ) # Description of Changes Upgrades the rust SDK's prometheus dependency from 0.13 to 0.14. Fixes https://github.com/clockworklabs/SpacetimeDB/issues/4597 # API and ABI breaking changes [The prometheus changelog](https://github.com/tikv/rust-prometheus/blob/master/CHANGELOG.md#0140) claims that the MSRV for the new version is 1.82, but this project doesn't seem to have an official MSRV, so I don't think that's an ABI change. I don't think depending on a different prometheus version is itself a breaking change. Prometheus is exposed through the rust SDK, but in an explicitly [unstable module](https://github.com/clockworklabs/SpacetimeDB/blob/3f58b5951bf3c49971c51aecb526439597b9c044/sdks/rust/src/lib.rs#L69-L76) which "may change incompatibly without a major version bump". Prometheus structs are also exposed from several crates, but with the same disclaimers about unstable interfaces. # Expected complexity level and risk 1. This is a module bump with simple-looking changes. # Testing - [x] Just confirmed everything still compiles and the tests still pass Co-authored-by: Zeke Foppa <196249+bfops@users.noreply.github.com>	2026-03-11 21:58:35 +00:00
joshua-spacetime	408e54fd3b	Fix disconnects breaking view updates for other connections (#4607) # Description of Changes Fixes a bug in the client disconnect logic, which marks a client's views as dropped (unsubscribed). The bug was that it marked the identity's views as dropped, not the connection's. So if an identity had multiple connections open, each subscribing to different views, and one of them disconnected, the subscriptions for the other connections would break. The observed behavior was that they would stop receiving subscription updates. This could potentially lead to their client cache getting into a corrupted state. Now, instead of dropping all of the views for a particular identity on disconnect, we drop only the views for that particular connection. And when I say drop, what I really mean is decrement. A view is not dropped completely unless it no longer has any subscribers. # API and ABI breaking changes None # Expected complexity level and risk 2 # Testing Regression smoketest was added	2026-03-11 15:17:31 +00:00
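The fix described in the entry above amounts to keying view subscriptions by connection rather than by identity, and dropping a view only when its last subscriber leaves. A minimal sketch, assuming integer stand-ins for identities and connection IDs and a hypothetical `ViewSubscribers` type (none of these names come from the codebase):

```rust
use std::collections::{BTreeSet, HashMap};

// Simplified stand-ins; the real identity and connection ID types are wider.
pub type Identity = u64;
pub type ConnectionId = u64;

/// Hypothetical tracker: each view maps to the set of (identity, connection)
/// pairs subscribed to it.
#[derive(Default)]
pub struct ViewSubscribers {
    by_view: HashMap<String, BTreeSet<(Identity, ConnectionId)>>,
}

impl ViewSubscribers {
    pub fn subscribe(&mut self, view: &str, identity: Identity, conn: ConnectionId) {
        self.by_view
            .entry(view.to_owned())
            .or_default()
            .insert((identity, conn));
    }

    pub fn subscriber_count(&self, view: &str) -> usize {
        self.by_view.get(view).map_or(0, |subs| subs.len())
    }

    /// Removes only this connection's subscriptions. Returns the views that
    /// lost their last subscriber and were therefore dropped, sorted by name.
    pub fn disconnect(&mut self, identity: Identity, conn: ConnectionId) -> Vec<String> {
        let mut dropped = Vec::new();
        self.by_view.retain(|view, subs| {
            subs.remove(&(identity, conn));
            if subs.is_empty() {
                dropped.push(view.clone());
                false
            } else {
                true
            }
        });
        dropped.sort();
        dropped
    }
}
```

With two connections under one identity, disconnecting the first leaves the second connection's views, and any view they share, in place.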
clockwork-labs-bot	e991421009	Wait for database to load before returning schema (#4551 ) ## Summary When hitting `/v1/schema` while a database is still loading (replaying the log, running init reducers, etc.), the endpoint returned a 500 error because the module host was not yet available. ## Changes - Add `Host::wait_for_module(timeout)` in `crates/client-api/src/lib.rs` -- polls `get_module_host` with exponential backoff (100ms, 200ms, 400ms, 800ms, 1s, 1s, ...) up to the given timeout - Update the `/v1/schema` route to use `wait_for_module(10s)` instead of the immediate `module()` call If the database finishes loading within 10 seconds, the schema is returned normally. If it does not load in time, the existing 500 error is returned (same behavior as before, just delayed). No other routes are changed -- this is scoped to the schema endpoint per the issue description. Other routes (SQL, call, etc.) could adopt the same pattern if needed. Fixes clockworklabs/SpacetimeDBPrivate#2748 Co-authored-by: clockwork-labs-bot <clockwork-labs-bot@users.noreply.github.com>	2026-03-04 16:55:48 +00:00
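The polling schedule described in the entry above (100ms, 200ms, 400ms, 800ms, then 1s repeatedly, bounded by a timeout) can be sketched as follows. This is a synchronous illustration under stated assumptions: `backoff_delay` and `wait_for` are hypothetical names, the sleep function is injected so the logic can be exercised without real waiting, and clamping the last delay to the remaining budget is a choice of this sketch. The actual `Host::wait_for_module` is not reproduced here.

```rust
use std::time::Duration;

/// Delay before retry number `attempt` (0-based): 100ms doubling each time,
/// capped at 1s.
pub fn backoff_delay(attempt: u32) -> Duration {
    let ms = 100u64.saturating_mul(1u64 << attempt.min(10));
    Duration::from_millis(ms.min(1_000))
}

/// Calls `poll` until it yields `Some`, sleeping with backoff in between,
/// and gives up once the accumulated sleep time reaches `timeout`.
pub fn wait_for<T>(
    timeout: Duration,
    mut poll: impl FnMut() -> Option<T>,
    mut sleep: impl FnMut(Duration),
) -> Option<T> {
    let mut waited = Duration::ZERO;
    let mut attempt = 0;
    loop {
        if let Some(value) = poll() {
            return Some(value);
        }
        if waited >= timeout {
            return None;
        }
        // Never sleep past the deadline.
        let delay = backoff_delay(attempt).min(timeout - waited);
        sleep(delay);
        waited += delay;
        attempt += 1;
    }
}
```

In real use the `sleep` argument could be `std::thread::sleep`, or the loop could be rewritten around an async timer; injecting it keeps the schedule itself easy to check.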
Phoebe Goldman	018575d1f9	Expose `RawModuleDefV10` via the HTTP schema route (#4540 ) # Description of Changes Add `?version=10` as an option to `/v1/database/:name-or-identity/schema`, where previously only `?version=9` was supported. This seems to have been forgotten when we introduced `RawModuleDefV10`. Also, in an unrelated minor fixup, fix a copy-paste error in a doc comment in the V2 WebSocket format definition. # API and ABI breaking changes Additive extension to HTTP API. # Expected complexity level and risk 1 # Testing - [x] Did a local get against this route and got a JSON-ified `RawModuleDefV10`: ```bash $ curl http://localhost:3000/v1/database/chat-console-rs/schema?version=10 {"sections":[{"Typespace":{"types":[{"Product":{"elements":[{"name":{"some":"sender"},"algebraic_type":{"Product":{"elements":[{"name":{"some":"__identity__"},"algebraic_type":{"U256":[]}}]}}},{"name":{"some":"sent"},"algebraic_type":{"Product":{"elements":[{"name":{"some":"__timestamp_micros_since_unix_epoch__"},"algebraic_type":{"I64":[]}}]}}},{"name":{"some":"text"},"algebraic_type":{"String":[]}}]}},{"Product":{"elements":[{"name":{"some":"identity"},"algebraic_type":{"Product":{"elements":[{"name":{"some":"__identity__"},"algebraic_type":{"U256":[]}}]}}},{"name":{"some":"name"},"algebraic_type":{"Sum":{"variants":[{"name":{"some":"some"},"algebraic_type":{"String":[]}},{"name":{"some":"none"},"algebraic_type":{"Product":{"elements":[]}}}]}}},{"name":{"some":"online"},"algebraic_type":{"Bool":[]}}]}}]}},{"Types":[{"source_name":{"scope":[],"source_name":"Message"},"ty":0,"custom_ordering":true},{"source_name":{"scope":[],"source_name":"User"},"ty":1,"custom_ordering":true}]},{"Tables":[{"source_name":"message","product_type_ref":0,"primary_key":[],"indexes":[],"constraints":[],"sequences":[],"table_type":{"User":[]},"table_access":{"Public":[]},"default_values":[],"is_event":false},{"source_name":"user","product_type_ref":1,"primary_key":[0],"indexes":[{"source_name":{"some":"u
ser_identity_idx_btree"},"accessor_name":{"some":"identity"},"algorithm":{"BTree":[0]}}],"constraints":[{"source_name":{"some":"user_identity_key"},"data":{"Unique":{"columns":[0]}}}],"sequences":[],"table_type":{"User":[]},"table_access":{"Public":[]},"default_values":[],"is_event":false}]},{"Reducers":[{"source_name":"identity_connected","params":{"elements":[]},"visibility":{"Private":[]},"ok_return_type":{"Product":{"elements":[]}},"err_return_type":{"String":[]}},{"source_name":"identity_disconnected","params":{"elements":[]},"visibility":{"Private":[]},"ok_return_type":{"Product":{"elements":[]}},"err_return_type":{"String":[]}},{"source_name":"init","params":{"elements":[]},"visibility":{"Private":[]},"ok_return_type":{"Product":{"elements":[]}},"err_return_type":{"String":[]}},{"source_name":"send_message","params":{"elements":[{"name":{"some":"text"},"algebraic_type":{"String":[]}}]},"visibility":{"ClientCallable":[]},"ok_return_type":{"Product":{"elements":[]}},"err_return_type":{"String":[]}},{"source_name":"set_name","params":{"elements":[{"name":{"some":"name"},"algebraic_type":{"String":[]}}]},"visibility":{"ClientCallable":[]},"ok_return_type":{"Product":{"elements":[]}},"err_return_type":{"String":[]}}]},{"LifeCycleReducers":[{"lifecycle_spec":{"Init":[]},"function_name":"init"},{"lifecycle_spec":{"OnConnect":[]},"function_name":"identity_connected"},{"lifecycle_spec":{"OnDisconnect":[]},"function_name":"identity_disconnected"}]},{"ExplicitNames":{"entries":[{"Table":{"source_name":"message","canonical_name":"message"}},{"Table":{"source_name":"user","canonical_name":"user"}},{"Function":{"source_name":"identity_connected","canonical_name":"identity_connected"}},{"Function":{"source_name":"identity_disconnected","canonical_name":"identity_disconnected"}},{"Function":{"source_name":"init","canonical_name":"init"}},{"Function":{"source_name":"send_message","canonical_name":"send_message"}},{"Function":{"source_name":"set_name","canonical_name":"set_nam
e"}}]}}]} ```	2026-03-04 05:21:38 +00:00
Noa	e3582131fe	Migrate to Rust 2024 (#3802 ) # Description of Changes It'd be best to review this commit-by-commit, and using [difftastic](https://difftastic.wilfred.me.uk) to easily tell when changes are minor in terms of syntax but a line based diff doesn't show that. # Expected complexity level and risk 3 - edition2024 does bring changes to drop order, which could cause issues with locks, but I looked through [all of the warnings that weren't fixed automatically](https://gistcdn.githack.com/coolreader18/80485ae5c5f82de1784229cce2febb26/raw/ba80f3fecda66ceb34f4f7ad73b98ea02d4893a2/warnings.html) and couldn't find any issues. # Testing n/a; internal code change	2026-03-03 11:06:52 +00:00
Zeke Foppa	2b85157d18	Confirmed reads default only for v2 connections (#4419) # Description of Changes Reducing scope of https://github.com/clockworklabs/SpacetimeDB/pull/4390 to only apply to V2 clients. # API and ABI breaking changes I think this is an API change? # Expected complexity level and risk 1 # Testing Co-authored-by: Zeke Foppa <bfops@users.noreply.github.com>	2026-02-23 23:27:12 -08:00
clockwork-labs-bot	b002158db8	Enable confirmed reads by default (#4390 ) ## Summary Enable confirmed reads by default for all WebSocket subscriptions and SQL queries. This is a 2.0 breaking change that improves data integrity. ### What changed Previously, subscription updates and SQL results were sent to clients immediately, before the transaction was confirmed durable. A server crash could cause clients to have observed data that was lost. Now the server defaults to `confirmed=true`. Clients receive updates only after durability is confirmed. This adds a small latency cost but guarantees that any data a client receives will survive a server restart. ### Changes Server (2 files, 2 lines each): - `subscribe.rs`: `SubscribeQueryParams.confirmed` defaults to `true` - `database.rs`: `SqlQueryParams.confirmed` defaults to `true` Documentation: - Migration guide updated with "Confirmed Reads Enabled by Default" section - Added to overview list and quick migration checklist ### Opt-out Clients can opt out by explicitly passing `?confirmed=false` in the WebSocket URL or using `.withConfirmedReads(false)` / `.WithConfirmedReads(false)` / `.with_confirmed_reads(false)` in SDKs. ### Smoketest impact Smoketests that don't explicitly pass `--confirmed` will now get confirmed reads via the server default. This should not cause failures -- confirmed reads only add a small wait for durability confirmation before sending results. The `confirmed_reads.py` smoketest explicitly passes `--confirmed` and continues to work as before. ### SDK impact No SDK changes needed. SDKs only send the `confirmed` query parameter when explicitly set by the user. When not set, the server default applies -- which is now `true`. 
--------- Signed-off-by: Zeke Foppa <196249+bfops@users.noreply.github.com> Co-authored-by: clockwork-labs-bot <clockwork-labs-bot@users.noreply.github.com> Co-authored-by: clockwork-labs-bot <bot@clockworklabs.com> Co-authored-by: Zeke Foppa <196249+bfops@users.noreply.github.com> Co-authored-by: Zeke Foppa <bfops@users.noreply.github.com>	2026-02-24 02:27:42 +00:00
joshua-spacetime	919908fb70	Trigger view refresh when a WASM procedure commits a transaction (#4301 ) # Description of Changes This PR fixes the bug where WASM procedures could commit a transaction without refreshing affected materialized views, which caused view-backed subscriptions to miss updates from procedure writes. The equivalent changes for V8 will be made in a separate patch. # API and ABI breaking changes None # Expected complexity level and risk 3 This was not a simple translation of the reducer code path, because reducers are run in a single transaction whereas procedures can have multiple transactions via `with_tx` and views must be refreshed with each transaction instead of once at the end of the procedure. This required refresh to be inserted into the syscall itself which required some extra plumbing. Mainly that we had to attach the validated `ModuleDef` to the `WasmInstanceEnv`. This should not affect hotswapping because we instantiate an entirely new `WasmInstanceEnv` in that case. # Testing - [x] Added a regression test in the form of a smoketest that subscribes to a view and calls a procedure	2026-02-16 21:16:35 +00:00
Tyler Cloutier	184d4e9d3f	Implement server-side support for the v2 websocket protocol (#4213) # Description of Changes This adds the v2 websocket protocol and adds support on the server side. For context on many of the changes/decisions, you can look at the discussion on https://github.com/clockworklabs/SpacetimeDB/pull/4023. To restate some of the key changes: - The reducer event information is no longer sent with transaction updates (because we don't want to broadcast reducer call information anymore). - If a client calls a reducer, they are sent a `ReducerResult` which includes the outcome of the reducer call and related row updates for queries that the client is subscribed to. - We no longer dedupe queries that appear in multiple query sets for the same client. This is because we are moving toward per-query storage. - Related to that, Unsubscribe requests have an option to send the related rows. We need this for now, since clients don't have per-query storage implemented yet. - We don't have the json format in v2. Notes for reviewers: - This moves around the messages in `crates/client-api-messages/src/websocket` (into `common`, `v1`, and `v2`), and this renaming of existing messages adds a lot of noise to the PR. - In many places, I chose to duplicate a lot of code to have a v1 version and a v2 version. I went with this to make it easier to remove the v1 version in the future (hopefully we can just fully delete most of the v1 functions). - `module_subscription_manager.rs` probably has the biggest changes, since we now track queries by query_set_id, and we get to remove some complexity of v1's FormatSwitch. # API and ABI breaking changes The v1 protocol still works, though we won't send the reducer event info for v10 modules. # Expected complexity level and risk 4. This touches a lot of places. # Testing Unit testing is pretty minimal for the new code paths. 
I've done some manual e2e testing with the typescript quickstart, and this has been tested with a different branch implementing the v2 rust client. --------- Co-authored-by: Phoebe Goldman <phoebe@goldman-tribe.org> Co-authored-by: Jeffrey Dallatezza <jeffreydallatezza@gmail.com>	2026-02-12 20:39:26 +00:00
Kim Altintop	914338457e	client-api: Resolve organization names via the tld, not the dns table (#4266 ) As specified in the proposal.	2026-02-12 12:00:22 +00:00
joshua-spacetime	e1769b5a24	Add warning prompt for 1.0 -> 2.0 module upgrade path (#4247) # Description of Changes Adds a warning prompt for the 1.0 -> 2.0 module upgrade path. # API and ABI breaking changes None # Expected complexity level and risk 1 # Testing This patch checks a wasm module binary compiled with pre-2.0 bindings into source control. A smoketest was added that first publishes the pre-compiled module and then publishes a new module using the 2.0 bindings in its place.	2026-02-11 18:56:08 +00:00
Kim Altintop	46f572b7a2	Organizations (#4087 ) Implements the client-api changes for the organizations feature. The interesting part of this PR are the `teams` smoketests. There is one subtlety with deletion. Note that I haven't (yet) covered the interaction between collaborators and organization existing for the same database.	2026-01-30 09:04:49 +00:00
Mazdak Farrokhzad	3d3c99f8db	Shrink `JsWorkerRequest` & use the right HashMap/Set (#4150 ) # Description of Changes This PR shrinks `JsWorkerRequest` so that it is (almost) as small as the call reducer request. To do that, a bunch of trivial changes had to be done to auth code, that mostly revolves around `String` -> `Box<str>`. This should help the auth code, but that is incidental. The main goal was to improve throughput through the request tx/rx channel for V8, which is taking quite a bit of time in flamegraphs. I also noticed while making this change that the wrong hash map was being used in a bunch of places, so I fixed all of those. A follow up PR will shrink the reply side to fit within a cache line. Yet another follow up PR will change the channel to replace flume with `fibre::spsc`. # API and ABI breaking changes None # Expected complexity level and risk 2, fairly trivial changes. # Testing Covered by existing tests.	2026-01-29 08:46:09 +00:00
Mazdak Farrokhzad	2fdbb3128f	Shrink `JsWorkerReply` to 48, making replies fit in a cache line (#4151 ) # Description of Changes See tin. This helps out with throughput for the V8 reply rx/tx channel. # API and ABI breaking changes None # Expected complexity level and risk 1 # Testing Covered by existing tests.	2026-01-28 17:26:11 +00:00
Kim Altintop	ef61c7c123	In-memory DatabaseLogger (#3961 ) This is the second step to make in-memory-only databases not touch the disk at all. While at it, also make it so file-backed module logs are streamed in constant memory where possible. Depends-on: #3912 # Expected complexity level and risk 2 # Testing Added some unit-level tests. --------- Signed-off-by: Kim Altintop <kim@eagain.io> Co-authored-by: Phoebe Goldman <phoebe@clockworklabs.io>	2026-01-16 07:10:04 +00:00
Mario Montoya	038622227d	Make /v1/database/:name/call/:func call procedures too, remove procedure route (#3883) # Description of Changes Closes #3659 # API and ABI breaking changes Removes the procedure route and alters the semantics of the `call` route on both the server and the `cli`. # Expected complexity level and risk 1 # Testing - [x] Publish a module with `procedures` and observe that calling one through the `cli` prints the result. --------- Co-authored-by: Phoebe Goldman <phoebe@goldman-tribe.org>	2025-12-31 23:31:02 +00:00
Shubham Mishra	10fd8b2cd0	fix view deadlock (#3938 ) # Description of Changes Fixes a deadlock in the subscription code and HTTP SQL handler that was caused by calling view methods on the module while holding the transaction lock. I tried a couple of approaches to make the closures `Send` for all code paths that need to hold the transaction while working with views, but that didn't work out well. The V8 module communicates with the host through channels, which would require dynamic dispatch. In the current approach, all existing methods that were calling views from the host are now invoked from inside the module itself. In the future, it would be better to move these methods to a common place rather than leaving them scattered.	2025-12-30 20:02:15 +00:00
Mazdak Farrokhzad	8e3af49f64	Reuse buffers in `ServerMessage<BsatnFormat>` (#2911 ) # Description of Changes Fixes https://github.com/clockworklabs/SpacetimeDB/issues/2824. Defines a global pool `BsatnRowListBuilderPool` which reclaims the buffers of a `ServerMessage<BsatnFormat>` and which is then used when building new `ServerMessage<BsatnFormat>`s. Notes: 1. The new pool `BsatnRowListBuilderPool` reports the same kind of metrics to prometheus as `PagePool` does. 2. `BsatnRowListBuilder` now works in terms of `BytesMut`. 3. The trait method `fn to_bsatn_extend` is redefined to be capable of dealing with `BytesMut` as well as `Vec<u8>`. 4. A trait `ConsumeEachBuffer` is defined from `ServerMessage<BsatnFormat>` and down to extract buffers. `<ServerMessage<_> as ConsumeEachBuffer>::consume_each_buffer(...)` is then called in `messages::serialize(...)` just after bsatn-encoding the entire message and before any compression is done. This is the place where the pool reclaims buffers. # Benchmarks Benchmark numbers vs. master using `cargo bench --bench subscription -- --baseline subs` on i7-7700K, 64GB RAM: ``` footprint-scan time: [21.607 ms 21.873 ms 22.187 ms] change: [-62.090% -61.438% -60.787%] (p = 0.00 < 0.05) Performance has improved. full-scan time: [22.185 ms 22.245 ms 22.324 ms] change: [-36.884% -36.497% -36.166%] (p = 0.00 < 0.05) Performance has improved. ``` The improvements in `footprint-scan` are mostly thanks to https://github.com/clockworklabs/SpacetimeDB/pull/2918, but 7 ms of the improvements here are thanks to the pool. The improvements to `full-scan` should be only thanks to the pool. # API and ABI breaking changes None # Expected complexity level and risk 2? # Testing - Tests for `Pool<T>` also apply to `BsatnRowListBuilderPool`.	2025-12-18 23:02:36 +00:00
Kim Altintop	a2c434141d	client-api: Pause time in websocket timeout tests (#3896 ) Using `#[tokio::test(start_paused = true)]` pauses time, yet tokio will still advance it when encountering `sleep`s while it has no other work to do. This makes the tests that rely on timeouts deterministic and should prevent those tests from becoming flaky on busy machines. # Expected complexity level and risk 2 # Testing This modifies tests. It does appear to work as described, but it can't hurt if the reviewers convince themselves that it does indeed.	2025-12-18 10:17:41 +00:00
Kim Altintop	16f1b2c1fe	client-api: Deny changing the parent of an existing database (#3837 ) Mainly a smoketest to exercise the intended behaviour. Also return an error if we end up delegating to the reset database endpoint, which itself doesn't accept a `parent` parameter.	2025-12-09 18:35:48 +00:00
Kim Altintop	062649c92e	client-api: Send WebSocket messages fragmented (#2931 ) RFC 6455, Section 5.4 describes message fragmentation, and we can do that with tungstenite. It does seem to help getting control messages (ping, pong, close) through without head-of-line blocking. # Expected complexity level and risk 2 - Need to test with clients # Testing TBD - some more abstraction is needed due to the difficulty of synthetically producing a large outgoing message.	2025-12-09 09:21:11 +00:00
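The fragmentation idea from RFC 6455, Section 5.4 can be sketched as follows. This is a simplified model and not the tungstenite API: the `Fragment` type and `fragment` function are hypothetical names used only to show how a large payload is cut into frames, with the first frame carrying the data opcode, later frames marked as continuations, and only the last frame carrying the FIN bit. Control frames (ping, pong, close) can then be sent between fragments.

```rust
/// One fragment of a larger message (hypothetical model type).
#[derive(Debug, PartialEq)]
struct Fragment<'a> {
    /// True only on the last fragment (the FIN bit in RFC 6455).
    fin: bool,
    /// True on every fragment after the first (continuation opcode).
    continuation: bool,
    payload: &'a [u8],
}

/// Split `message` into fragments of at most `max_len` payload bytes.
fn fragment(message: &[u8], max_len: usize) -> Vec<Fragment<'_>> {
    assert!(max_len > 0, "fragment size must be non-zero");
    if message.is_empty() {
        // An empty message is still a single, final frame.
        return vec![Fragment { fin: true, continuation: false, payload: message }];
    }
    let count = (message.len() + max_len - 1) / max_len;
    message
        .chunks(max_len)
        .enumerate()
        .map(|(i, chunk)| Fragment {
            fin: i + 1 == count,
            continuation: i > 0,
            payload: chunk,
        })
        .collect()
}

fn main() {
    let message = vec![0u8; 10];
    for frame in fragment(&message, 4) {
        println!(
            "fin={} continuation={} len={}",
            frame.fin,
            frame.continuation,
            frame.payload.len()
        );
    }
}
```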
Kim Altintop	a959996ba7	Debug "stuck module" issue (#3813 ) Adds some logging and times out lock acquisition attempts in the host controller. Should help debugging clockworklabs/SpacetimeDBPrivate#2337	2025-12-04 20:23:32 +00:00
Phoebe Goldman	b17e0dbe6c	Rename the `/database/procedure` route to be `unstable` (#3723 ) # Description of Changes Title. Closes #3644 . # API and ABI breaking changes Moves an unreleased HTTP route to make it explicitly unstable. # Expected complexity level and risk 1 # Testing - [x] Used `curl` to hit the previous, not-unstable-looking route, got a 404. - [x] Used `curl` to hit the new, obviously-unstable route, got a proper response.	2025-11-24 23:27:35 +00:00
Tyler Cloutier	6d7b0d87ce	Added staging to allowable issuers (#3714 ) # Description of Changes This modifies the environment variable that we pass in for requiring SpacetimeAuth to publish so that we can put different issuer values for live and staging. # API and ABI breaking changes This requires a new environment variable or it skips this requirement. # Expected complexity level and risk 1 # Testing I have not tested this change locally. --------- Signed-off-by: Tyler Cloutier <cloutiertyler@users.noreply.github.com>	2025-11-20 20:32:21 +00:00
Julien Lavocat	b13f12dac0	Add extra claims to v1/identity/websocket-token (#3705 ) # Description of Changes Due to a limitation around passing headers to a WebSocket connection, the TypeScript SDK relies on the endpoint `/v1/identity/websocket-token` to get a new, short-lived token. Currently, this endpoint strips all the other claims from the token and only returns the following claims: - `hex_identity` - `sub` - `iss` - `aud` - `iat` - `exp` This PR aims to fix this issue by introducing a new member field `extra` to `SpacetimeIdentityClaims` and `TokenClaims` and letting serde do its job. # API and ABI breaking changes None # Expected complexity level and risk 2 - The change is trivial (1) but I'm not 100% familiar with all the places where we would be signing a token (1). # Testing 1. `curl` the endpoint and check that the token returned contains all the expected claims 2. Check that the endpoint `v1/identity` still correctly issues an identity and token --------- Co-authored-by: Jeffrey Dallatezza <jeffreydallatezza@gmail.com>	2025-11-20 18:17:50 +00:00
Mazdak Farrokhzad	0a3251708a	Add `ProcedureContext::with_tx` (#3638 ) # Description of Changes Adds `ProcedureContext::{with_tx, try_with_tx}`. Fixes https://github.com/clockworklabs/SpacetimeDB/issues/3515. # API and ABI breaking changes None # Expected complexity level and risk 2 # Testing An integration test `test_calling_with_tx` is added.	2025-11-19 20:33:02 +00:00
Tyler Cloutier	9e3ffeb932	Adds flag to publish to allow clearing on migration conflict. (#3601 ) # Description of Changes This PR modifies the `--delete-data` flag on `spacetime publish` and adds the `--delete-data` flag on `spacetime dev`. In particular, instead of `--delete-data` being a boolean, it is now an enum: - `always` -> corresponds to the old value of `true` - `never` -> corresponds to the old value of `false` - `on-conflict` -> clears the database, but only if publishing would have required a manual migration This flag does NOT change any behavior about prompting users to confirm if they want to delete the data. Users will still be prompted to confirm UNLESS they pass the separate `--yes` flag. `spacetime dev` gets the same `--delete-data` flag. The default value of `never` is equivalent to the existing behavior. `spacetime dev` continues to publish with `--yes` just as before. This behavior is unchanged. # API and ABI breaking changes Adds the flags specified above. This is NOT a breaking change to the CLI. Passing `--delete-data` is the equivalent of `--delete-data=always`. This IS technically a breaking change to the `pre_publish` route. As far as I'm aware this is only used by our CLI however. > IMPORTANT SIDE NOTE: I would argue that `--break-clients` should really be renamed to `--yes-break-clients` because it actually behaves like the `--yes` force flag, but only for a subset of the user prompts. I have not made this change because it would be a breaking change, but if the reviewers agree, I will make this change. # Expected complexity level and risk 2, Very small change, but if we get it wrong users could accidentally lose data. I would ask reviewers to think about ways that users might accidentally pass `--delete-data --yes`. # Testing - [ ] I have not yet tested manually. --------- Signed-off-by: Tyler Cloutier <cloutiertyler@users.noreply.github.com> Co-authored-by: Zeke Foppa <196249+bfops@users.noreply.github.com> Co-authored-by: Zeke Foppa <bfops@users.noreply.github.com> Co-authored-by: John Detter <4099508+jdetter@users.noreply.github.com>	2025-11-19 17:38:49 +00:00
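The three-valued `--delete-data` semantics described in this commit can be sketched as a small enum. This is an illustration under assumptions, not the CLI's actual code: the type `DeleteData` and the functions `resolve` and `should_clear` are hypothetical names. Only the values `always`, `never`, `on-conflict`, the `never` default, and the bare-flag-means-`always` rule come from the message above.

```rust
use std::str::FromStr;

/// Hypothetical model of the `--delete-data` values.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum DeleteData {
    Always,
    Never,
    OnConflict,
}

impl FromStr for DeleteData {
    type Err = String;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "always" => Ok(Self::Always),
            "never" => Ok(Self::Never),
            "on-conflict" => Ok(Self::OnConflict),
            other => Err(format!("unknown --delete-data value: {other}")),
        }
    }
}

/// `None`: flag absent (default `never`).
/// `Some(None)`: bare `--delete-data` (treated as `always`).
/// `Some(Some(v))`: `--delete-data=v`.
fn resolve(flag: Option<Option<&str>>) -> Result<DeleteData, String> {
    match flag {
        None => Ok(DeleteData::Never),
        Some(None) => Ok(DeleteData::Always),
        Some(Some(value)) => value.parse(),
    }
}

/// Whether the database should be cleared, given whether publishing
/// would otherwise have required a manual migration.
fn should_clear(mode: DeleteData, requires_manual_migration: bool) -> bool {
    match mode {
        DeleteData::Always => true,
        DeleteData::Never => false,
        DeleteData::OnConflict => requires_manual_migration,
    }
}

fn main() {
    let mode = resolve(Some(Some("on-conflict"))).expect("valid value");
    println!("clear on clean publish: {}", should_clear(mode, false));
    println!("clear on conflict:      {}", should_clear(mode, true));
}
```

Confirmation prompting is deliberately left out of the sketch, since the commit states that `--yes` controls it separately.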
Kim Altintop	dedbfb4958	client-api: Make `ControlStateReadAccess` an async trait (#3357 ) Historically, controldb reads could be treated as just a datastructure, but that became a lie when reconnections were introduced. Some tricks were employed to enter the async context when needed, but those always bear the risk of introducing a deadlock somewhere. So, just make it async. # Expected complexity level and risk 2 It may or may not be safe to use sled in an async context. We did already for the write path, but if it's a problem it'll show now. # Testing Not a functional change.	2025-11-14 07:58:50 +00:00
Kim Altintop	3ea5fdea20	[teams 5/5] Identity routes (#3526 ) Introduces a "routes struct" for the `/identity` endpoints, much like the `DatabaseRoutes`. This is useful for overriding individual handlers. See companion for motivation. Depends-on: #3525	2025-11-12 15:40:55 +00:00
Kim Altintop	310d8eb7ae	[teams 4/5] SQL authorization (#3525 ) Permissions for evaluating SQL/DML are not generally "actions", but more a set of permissions that are checked during evaluation. To make this work with the teams feature, this patch extends `AuthCtx` to allow checking a set of permissions as mandated by the spec. This set is a bit more fine-grained than "is owner", so as to avoid baking in the concept of teams/collaborators, or assumptions about what a role might entail. Both are likely to evolve in the future, so evaluation of permissions / capabilities should be confined to the impl of the `Authorization` trait. Unlike "actions", the `AuthCtx` must be able to evaluate permission checks quickly and without side-effects, nor can it enter an `async` context. In that sense, it is precomputed (if you will), and stored as a closure in the `AuthCtx` for external authorization. A challenge posed is how to thread through the constructed `AuthCtx` for subscriptions. A tempting approach would have been to equip the `HostController` with the ability to summon an `AuthCtx`. That, however, would have created a gnarly circular dependency, because the `HostController` also controls the controldb, which itself demands an `AuthCtx`. Instead, the `AuthCtx` is obtained in the endpoint handler and passed to each method call that requires one. That's less pretty, but more effective. --------- Signed-off-by: Kim Altintop <kim@eagain.io> Co-authored-by: Phoebe Goldman <phoebe@clockworklabs.io>	2025-11-11 20:19:16 +00:00
Kim Altintop	a36f7091d5	[teams 3/5] API authorization, CLI, smoketests (#3523 ) This adds authorization to the relevant API endpoints, updates the CLI commands and adds smoketests for the teams feature. Note: Authorizing SQL (incl. subscriptions) is a bit more involved, and submitted as a separate PR in the series. Depends-on: https://github.com/clockworklabs/SpacetimeDB/pull/3519	2025-11-11 14:10:58 +00:00
Kim Altintop	99882ad436	[teams 2/5] client-api: Add `parent` parameter to publish endpoint (#3519 ) This is the minimal patch to implement the private controldb side of the teams feature. The parameter is ignored for now. Depends-on: https://github.com/clockworklabs/SpacetimeDB/pull/3496	2025-11-11 10:26:41 +00:00
Kim Altintop	e0b8e6f265	[teams 1/5] Reset database (#3611 ) So far, the `--clear-database` option to `publish` has simply dropped and then re-created the database (if it did exist). This will no longer work when databases can have "children": because dropping and re-creating is not atomic, children would either become orphans, or be dropped as well. To solve this, `reset_database` is introduced as a separate action that: - shuts down all replicas - if a `program_bytes` is supplied, replaces the database's initial program - if a `host_type` is supplied, replaces the database's host type - starts `num_replicas` or the previous number of replicas, which initialize themselves as normal As this could be its own CLI command, the action is provided as its own API endpoint (undocumented). However, since `publish` has no way of knowing whether the database it operates on actually exists, the `publish_database` handler will just invoke the `reset_database` handler if `clear=true` and the database exists, and return its result. This is to avoid starting the transfer of the program in the request body, only to receive a redirect. Some refactoring was necessary to dissect and understand the flow. # API and ABI breaking changes Introduces a new, undocumented API endpoint. We may want to nest it under `/unstable`. # Expected complexity level and risk 2 # Testing From the outside, the observed behavior should be as before, so smoketests should cover it.	2025-11-11 08:39:24 +00:00
joshua-spacetime	edac806697	Materialize views on subscribe (#3599 ) # Description of Changes This patch: 1. Materializes views on subscribe and sql calls by invoking `call_view` on the `ModuleHost`. 2. Downgrades to a read-only transaction after view materialization but before query execution. 3. Updates the `st_view_sub` system table on both subscribe and unsubscribe. 4. Makes subscribe methods on the SubscriptionManager async. # API and ABI breaking changes None # Expected complexity level and risk 2 # Testing End-to-end tests to be added with atomic view updates	2025-11-08 22:47:08 +00:00
Shubham Mishra	75c6e67c3c	Views: Host interface for WASM modules (#3548 ) # Description of Changes Host implementation to invoke the `call_view` method. It also covers: 1. API `MutTxId::is_materialized` to check whether an existing view exists and is updated. 2. Update in readsets logic to remove stale views. 3. SQL caller implementation. # API and ABI breaking changes NA How complicated do you think these changes are? Grade on a scale from 1 to 5, where 1 is a trivial change, and 5 is a deep-reaching and complex change. 3	2025-11-06 21:14:00 +00:00
joshua-spacetime	30b8eaccf1	Decrement view subscriber count on disconnect (#3547 ) # Description of Changes Refactored `st_view_client` and renamed it `st_view_sub` which tracks the number of clients subscribed to a view. On disconnect, we decrement the `num_subscribers` column in the appropriate rows. An async task will be in charge of cleaning up views (and their read sets) whose subscriber count has gone to zero (not in this patch). On module init, we clear the entirety of each view table. # API and ABI breaking changes None. Technically this updates the schema of a system table, but the system table was added and modified between releases. # Expected complexity level and risk ~2 Need to make sure we cover all cases so that we don't leave dangling data. Making these tables ephemeral in the future should simplify this. # Testing Will add tests once we can subscribe to views	2025-11-06 07:55:17 +00:00
Zeke Foppa	09828ee954	Revert "[teams 1/5] Reset database (#3496 )" (#3580 ) # Description of Changes This reverts commit #3496. # API and ABI breaking changes Technically maybe yes? But definitely nothing is using the new code yet. # Expected complexity level and risk 1 # Testing CI only Co-authored-by: Zeke Foppa <bfops@users.noreply.github.com>	2025-11-05 20:23:32 +00:00
Kim Altintop	5c42b091aa	[teams 1/5] Reset database (#3496 ) So far, the `--clear-database` option to `publish` has simply dropped and then re-created the database (if it did exist). This will no longer work when databases can have "children": because dropping and re-creating is not atomic, children would either become orphans, or be dropped as well. To solve this, `reset_database` is introduced as a separate action that: - shuts down all replicas - if a `program_bytes` is supplied, replaces the database's initial program - if a `host_type` is supplied, replaces the database's host type - starts `num_replicas` or the previous number of replicas, which initialize themselves as normal As this could be its own CLI command, the action is provided as its own API endpoint (undocumented). However, since `publish` has no way of knowing whether the database it operates on actually exists, the `publish_database` handler will just invoke the `reset_database` handler if `clear=true` and the database exists, and return its result. This is to avoid starting the transfer of the program in the request body, only to receive a redirect. Some refactoring was necessary to dissect and understand the flow. # API and ABI breaking changes Introduces a new, undocumented API endpoint. We may want to nest it under `/unstable`. # Expected complexity level and risk 2 # Testing From the outside, the observed behavior should be as before, so smoketests should cover it.	2025-11-05 10:55:28 +00:00


224 Commits