## Summary
The `/v1/database/:name_or_identity/sql` endpoint now calls the module's
`client_connected` reducer before executing SQL, and
`client_disconnected` after. This allows module authors to accept or
reject SQL connections based on the caller's identity, matching the
existing behavior of the `/call` endpoint.
## Motivation
Previously, the `/sql` endpoint bypassed the module's `onConnect` hook
entirely, meaning module authors had no way to restrict who could run
SQL queries against their database. The `/call` endpoint already runs
the connect hook, so this brings `/sql` to parity.
## Changes
- `crates/client-api/src/routes/database.rs`: The `sql` handler now:
1. Generates a random connection ID
2. Calls `module.call_identity_connected()` before executing SQL
3. Executes the SQL query
4. Calls `module.call_identity_disconnected()` after
5. If `client_connected` rejects the connection, returns 403 Forbidden
without executing the query
- `sql_direct()` is unchanged since it is also used by the pgwire
server, which has its own connection lifecycle.
## Behavior
- If the module defines a `client_connected` reducer that throws/errors
for a given identity, the SQL request returns `403 Forbidden`
- If no `client_connected` reducer is defined, behavior is unchanged
- The connection is always cleaned up via `client_disconnected` after
the query completes
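The connect/execute/disconnect bracketing described above can be sketched with stand-in types — `Module`, `SqlError`, and the method bodies here are illustrative, not the actual client-api types:

```rust
// Hypothetical sketch of the SQL endpoint's connection lifecycle.
// `Module` and its methods are stand-ins for the real module host API.
struct Module {
    reject: bool, // whether `client_connected` rejects this identity
}

#[derive(Debug, PartialEq)]
enum SqlError {
    Forbidden, // maps to HTTP 403
}

impl Module {
    fn call_identity_connected(&self, _conn_id: u64) -> Result<(), ()> {
        if self.reject { Err(()) } else { Ok(()) }
    }
    fn call_identity_disconnected(&self, _conn_id: u64) {}
}

fn run_sql(module: &Module, query: &str) -> Result<String, SqlError> {
    let conn_id = 42; // in the real handler, a randomly generated connection ID
    // Reject before executing anything if the module refuses the connection.
    module
        .call_identity_connected(conn_id)
        .map_err(|_| SqlError::Forbidden)?;
    let result = format!("executed: {query}");
    // Always clean up afterwards, mirroring `client_disconnected`.
    module.call_identity_disconnected(conn_id);
    Ok(result)
}
```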
---------
Co-authored-by: clockwork-labs-bot <clockwork-labs-bot@users.noreply.github.com>
Co-authored-by: Zeke Foppa <bfops@users.noreply.github.com>
Co-authored-by: Zeke Foppa <196249+bfops@users.noreply.github.com>
# Description of Changes
Adds some docs to the C# bindings packages.
# API and ABI breaking changes
None
# Expected complexity level and risk
0
# Testing
N/A
---------
Signed-off-by: Tyler Cloutier <cloutiertyler@users.noreply.github.com>
Co-authored-by: Tyler Cloutier <cloutiertyler@users.noreply.github.com>
# Description of Changes
Fixed a CLI call in docs Unity tutorial that generated client module
code in an incorrect directory.
# API and ABI breaking changes
None.
# Expected complexity level and risk
1
# Testing
Built a project by following the tutorial with the wrong call vs the
fixed call. Compared against different parts of the tutorial to confirm
the intent.
---------
Signed-off-by: T-Podgorski <147391857+T-Podgorski@users.noreply.github.com>
Signed-off-by: Tyler Cloutier <cloutiertyler@users.noreply.github.com>
Co-authored-by: Tyler Cloutier <cloutiertyler@users.noreply.github.com>
Co-authored-by: clockwork-labs-bot <clockwork-labs-bot@users.noreply.github.com>
# Description of Changes
Moves the replay-related code out of `datastore.rs` and
`committed_state.rs` into its own module, `replay.rs`.
This also shaves 64 bytes off of `CommittedState` so that we don't
continue carrying the cost of replay after it has finished.
This is a more limited version / part 1 of
https://github.com/clockworklabs/SpacetimeDB/pull/4055, focusing on
almost pure code motion so that we can be more diligent and careful
about the transition. Future PRs will continue with other parts of #4055.
# API and ABI breaking changes
None
# Expected complexity level and risk
1, just simple and safe code motion.
# Testing
Covered by existing tests as this is code motion.
# Description of Changes
A variety of small expansions/fixes to the smoketests lib.
Also, several of the tests were not working when `--server` was passed,
due to name collisions. Each test runs with its own identity, but if it
tried to publish over an existing database, it would get an auth error.
I have fixed the name collisions. However, the original python
smoketests simply used random strings, where we currently use hardcoded
names and/or hardcoded names with PID suffixes. This seems less robust
and should potentially be changed.
# API and ABI breaking changes
None. Testing only.
# Expected complexity level and risk
2
# Testing
- [x] CI passes
---------
Signed-off-by: Zeke Foppa <196249+bfops@users.noreply.github.com>
Co-authored-by: Zeke Foppa <bfops@users.noreply.github.com>
# Description of Changes
From the perspective of a new client SDK or module library developer,
with a focus on the SDK test suite.
# API and ABI breaking changes
N/a - it's internal docs.
# Expected complexity level and risk
0 - it's internal docs.
# Testing
N/a - it's internal docs.
---------
Signed-off-by: Tyler Cloutier <cloutiertyler@users.noreply.github.com>
Co-authored-by: Tyler Cloutier <cloutiertyler@users.noreply.github.com>
Co-authored-by: Mario Montoya <mamcx@elmalabarista.com>
Co-authored-by: clockwork-labs-bot <clockwork-labs-bot@users.noreply.github.com>
# Description of Changes
* Fixes #3770.
# API and ABI breaking changes
Technically breaks certain PG Wire responses by fixing the incorrect
behavior.
# Expected complexity level and risk
### 2:
Should not cause any actual breaking, as it just changes the response
type from empty rows to execution metrics where there can never be a
meaningful response.
# Testing
- [x] Manual testing
Co-authored-by: Tyler Cloutier <cloutiertyler@users.noreply.github.com>
## Summary
Fixes #3934. Removing or changing a `#[primary_key]` annotation succeeds
on the first re-publish, but the stored schema's `table_primary_key`
field in `st_table` is never updated. On the **next** publish,
`check_compatible` fails with:
```
Primary key mismatch: self.primary_key: Some(ColId(0)), def.primary_key: None
```
## Root Cause
`auto_migrate_table` handles removing the PK's index and unique
constraint, but there was no `AutoMigrateStep` to update the
`table_primary_key` field in the system table.
## Fix
- Add `AutoMigrateStep::ChangePrimaryKey` variant to the auto-migration
planner
- Detect `old.primary_key != new.primary_key` in `auto_migrate_table`
and emit the step
- Add `alter_table_primary_key` to the datastore layer (`mut_tx`,
`datastore`, `relational_db`) with proper rollback support via
`PendingSchemaChange::TableAlterPrimaryKey`
- Handle the step in `auto_migrate_database`
- Add migration plan formatter support (termcolor output)
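A minimal sketch of the detection step, assuming a simplified `AutoMigrateStep` (the real enum has many more variants and richer payloads):

```rust
// Hedged sketch of the primary-key diff: emit a migration step only when
// the stored PK differs from the new definition. `ColId` is simplified to u16.
#[derive(Debug, PartialEq)]
enum AutoMigrateStep {
    ChangePrimaryKey { new_pk: Option<u16> },
}

fn diff_primary_key(old_pk: Option<u16>, new_pk: Option<u16>) -> Option<AutoMigrateStep> {
    // `old_pk != new_pk` covers adding, removing, and moving the primary key.
    (old_pk != new_pk).then(|| AutoMigrateStep::ChangePrimaryKey { new_pk })
}
```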
## Repro Script
```bash
# 1. Publish with primary key
spacetime publish repro --yes # table has #[primary_key] on name
# 2. Remove primary key — succeeds
spacetime publish repro --yes # removed #[primary_key]
# 3. Any change — CRASHES
spacetime publish repro --yes # Primary key mismatch error
```
## Tests
- **Unit test** in `crates/core/src/db/update.rs`: Reproduces the exact
three-publish sequence from the issue (create table with PK → remove PK
→ trivial change)
- **Smoketest** in
`crates/smoketests/tests/smoketests/auto_migration.rs`: Full end-to-end
publish flow exercising the same scenario
## Files Changed
| File | Change |
|------|--------|
| `crates/schema/src/auto_migrate.rs` | Add `ChangePrimaryKey` variant +
detection |
| `crates/schema/src/auto_migrate/formatter.rs` | Format the new step |
| `crates/schema/src/auto_migrate/termcolor_formatter.rs` | Colored
output |
| `crates/datastore/src/locking_tx_datastore/tx_state.rs` |
`TableAlterPrimaryKey` pending change |
| `crates/datastore/src/locking_tx_datastore/mut_tx.rs` |
`alter_table_primary_key` |
| `crates/datastore/src/locking_tx_datastore/datastore.rs` | Expose
through datastore |
| `crates/datastore/src/locking_tx_datastore/committed_state.rs` |
Rollback support |
| `crates/core/src/db/relational_db.rs` | Expose through RelationalDB |
| `crates/core/src/db/update.rs` | Handle step + unit test |
| `crates/smoketests/.../auto_migration.rs` | Smoketest |
---------
Co-authored-by: clockwork-labs-bot <clockwork-labs-bot@users.noreply.github.com>
The previous approaches would either:
- panic when the queue becomes full, since `append_tx` runs inside the
context of a `LocalSet`, which is basically a glorified current-thread
runtime
- deadlock, because the receiver runtime has no way of notifying the
sender of freed capacity in the channel
`async-channel` handles wait queues and notifications internally, so it
can be used freely from either blocking or async contexts.
This _may_ come with different performance characteristics, but I haven't
measured them.
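A dependency-free sketch of the bounded handoff, substituting `std::sync::mpsc::sync_channel` for `async-channel`; the property that matters is the same: a full queue blocks the sender until the receiver frees capacity, instead of panicking or deadlocking:

```rust
use std::sync::mpsc::sync_channel;
use std::thread;

/// Feed `n` transaction offsets through a small bounded queue drained by a
/// dedicated receiver thread, returning the sum the receiver saw.
fn feed_offsets(n: u64) -> u64 {
    let (tx, rx) = sync_channel::<u64>(2); // tiny capacity, like a commit queue
    let receiver = thread::spawn(move || {
        let mut total = 0u64;
        while let Ok(tx_offset) = rx.recv() {
            total += tx_offset; // stand-in for appending to the commitlog
        }
        total
    });
    for tx_offset in 1..=n {
        tx.send(tx_offset).unwrap(); // blocks when the queue is full
    }
    drop(tx); // close the channel so the receiver exits
    receiver.join().unwrap()
}
```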
---------
Co-authored-by: joshua-spacetime <josh@clockworklabs.io>
# Description of Changes
The `v3` WebSocket API adds a thin transport layer around the existing
`v2` message schema so that multiple logical `ClientMessage`s can be
sent in a single WebSocket frame.
The motivation is throughput. In `v2`, each logical client message
requires its own WebSocket frame, which adds per-frame overhead in the
client runtime, server framing/compression path, and network stack.
High-throughput clients naturally issue bursts of requests, and batching
those requests into a single frame materially reduces that overhead
while preserving the existing logical message model.
`v3` keeps the `v2` message schema intact and treats batching as a
transport concern rather than a semantic protocol change. This lets the
server support both protocols cleanly:
- `v2` remains unchanged for existing clients
- `v3` allows new clients to batch logical messages without changing
reducer/procedure semantics
- inner messages are still processed in order
On the server side, this PR adds:
- `v3.bsatn.spacetimedb` protocol support
- `ClientFrame` / `ServerFrame` transport envelopes
- decoding of inbound batched client frames into ordered `v2` logical
messages
- v3 outbound framing on the server side
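The batching idea can be sketched with a toy length-prefixed envelope. The real `ClientFrame`/`ServerFrame` are BSATN-encoded; this only illustrates packing several logical messages into one frame and unpacking them in order:

```rust
/// Pack logical messages into a single frame as length-prefixed byte strings.
fn encode_frame(messages: &[&[u8]]) -> Vec<u8> {
    let mut frame = Vec::new();
    for msg in messages {
        frame.extend_from_slice(&(msg.len() as u32).to_le_bytes());
        frame.extend_from_slice(msg);
    }
    frame
}

/// Unpack a frame back into its logical messages, preserving send order.
fn decode_frame(frame: &[u8]) -> Vec<Vec<u8>> {
    let mut messages = Vec::new();
    let mut pos = 0;
    while pos + 4 <= frame.len() {
        let len = u32::from_le_bytes(frame[pos..pos + 4].try_into().unwrap()) as usize;
        pos += 4;
        messages.push(frame[pos..pos + len].to_vec());
        pos += len;
    }
    messages
}
```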
# API and ABI breaking changes
None. `v2` clients continue to work unchanged.
# Expected complexity level and risk
2
# Testing
Testing will be included in the patches that update the SDK bindings.
## Summary
- require an explicit confirmation before `spacetime delete` sends an
irreversible delete request
- preserve `--yes` as the non-interactive override
- add smoketests for aborting on `n`, deleting on `y`, and skipping the
prompt with `--yes`
Closes #4679.
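The confirmation gate can be sketched as follows; `confirm_delete` and its shape are illustrative, not the actual CLI code:

```rust
use std::io::BufRead;

/// Return whether the delete should proceed: `--yes` skips the prompt,
/// otherwise only an explicit `y`/`Y` answer confirms.
fn confirm_delete(yes_flag: bool, input: &mut impl BufRead) -> bool {
    if yes_flag {
        return true; // non-interactive override
    }
    // The real CLI would print something like "Are you sure? [y/N]" first.
    let mut answer = String::new();
    if input.read_line(&mut answer).is_err() {
        return false;
    }
    matches!(answer.trim(), "y" | "Y")
}
```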
## History note
`spacetime delete` has never required confirmation for ordinary deletes.
The command was added in commit `44df6c6e7` (`Initial commit`,
2023-08-01) without a prompt. Later, commit `a36f7091d` (`[teams 3/5]
API authorization, CLI, smoketests`, 2025-11-11) added a confirmation
flow only for the special case where deleting a database would also
delete its children, but plain deletes still executed immediately until
this change.
## Testing
Added new smoketests
---------
Co-authored-by: clockwork-labs-bot <clockwork-labs-bot@users.noreply.github.com>
Co-authored-by: Zeke Foppa <bfops@users.noreply.github.com>
## Summary
- add a `db_names` column to `spacetime list`
- keep the database identity column too
- use reverse-DNS lookups to show all known names for each database
- degrade gracefully if a reverse lookup fails instead of failing the
whole command
Closes #1046.
## Notes
I looked at #1072 for prior art. This version is adapted to the current
`spacetime list` implementation and uses the existing
`util::spacetime_reverse_dns` helper rather than calling the endpoint
directly.
If a database has no names, the command prints `(unnamed)`.
If a reverse-DNS lookup fails, the command prints `(lookup failed)` for
that row and warns on stderr.
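The per-row degradation logic can be sketched like this; `reverse_dns` is a stand-in closure for the `util::spacetime_reverse_dns` helper:

```rust
/// Render the `db_names` cell for one database, degrading gracefully when
/// the reverse lookup fails instead of failing the whole command.
fn names_column(
    identity: &str,
    reverse_dns: impl Fn(&str) -> Result<Vec<String>, String>,
) -> String {
    match reverse_dns(identity) {
        Ok(names) if names.is_empty() => "(unnamed)".to_string(),
        Ok(names) => names.join(", "),
        // The real command would also warn on stderr here.
        Err(_) => "(lookup failed)".to_string(),
    }
}
```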
## Testing
- `cargo check -p spacetimedb-cli`
---------
Signed-off-by: Zeke Foppa <196249+bfops@users.noreply.github.com>
Co-authored-by: clockwork-labs-bot <clockwork-labs-bot@users.noreply.github.com>
Co-authored-by: Zeke Foppa <bfops@users.noreply.github.com>
Co-authored-by: Zeke Foppa <196249+bfops@users.noreply.github.com>
# Description of Changes
Record metrics periodically in batches to avoid the database execution
thread waking up a parked task on every metrics `send`.
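A minimal sketch of the batching pattern: samples accumulate locally and are handed off only when the batch fills, so the hot path rarely wakes the consumer. The flush target is a plain `Vec` here where the real code would use a channel:

```rust
/// Toy metrics batcher: one handoff per full batch instead of per sample.
struct MetricsBatcher {
    buf: Vec<u64>,
    capacity: usize,
    flushed: Vec<Vec<u64>>, // stand-in for the channel the batches go to
}

impl MetricsBatcher {
    fn new(capacity: usize) -> Self {
        Self { buf: Vec::with_capacity(capacity), capacity, flushed: Vec::new() }
    }

    fn record(&mut self, sample: u64) {
        self.buf.push(sample);
        if self.buf.len() >= self.capacity {
            self.flush(); // one wakeup per batch instead of per `send`
        }
    }

    fn flush(&mut self) {
        if !self.buf.is_empty() {
            self.flushed.push(std::mem::take(&mut self.buf));
        }
    }
}
```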
# API and ABI breaking changes
N/A
# Expected complexity level and risk
1
# Testing
Benchmarked manually. Will add relevant benchmarks to CI in a separate
patch.
# Description of Changes
Previously we would invoke V8's `GetHeapStatistics` on every reducer
call in order to compute the `memory_allocation` part of its
`ExecutionStats`. We already sample heap statistics periodically, so now
we cache the result after each sample and just read the cached value
after each call. Memory allocation is used for energy tracking and so
eventual consistency is fine.
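The caching pattern can be sketched without V8; the `probe` closure stands in for the expensive `GetHeapStatistics` call:

```rust
use std::time::{Duration, Instant};

/// Cache an expensive probe: refresh at most once per interval, otherwise
/// return the cached value. Reads in between are eventually consistent,
/// which is fine for energy tracking.
struct CachedHeapStats {
    last_sample: Instant,
    interval: Duration,
    cached_bytes: u64,
}

impl CachedHeapStats {
    fn read(&mut self, probe: impl Fn() -> u64, now: Instant) -> u64 {
        if now.duration_since(self.last_sample) >= self.interval {
            self.cached_bytes = probe(); // expensive call, amortized
            self.last_sample = now;
        }
        self.cached_bytes
    }
}
```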
# API and ABI breaking changes
None
# Expected complexity level and risk
1
# Testing
Refactor/optimization
---------
Signed-off-by: joshua-spacetime <josh@clockworklabs.io>
Co-authored-by: Mazdak Farrokhzad <twingoow@gmail.com>
# Description of Changes
Before this change, local durability had two workers. The first worker
reordered slightly out-of-order submissions by `TxOffset` and
materialized the commitlog payload, and then forwarded the payload to
the 2nd worker which wrote to the commitlog.
The problem was that the first worker did such little work that it was
mostly parked, and so every transaction would pay the cost of waking it
up. And for very high throughput workloads, this cost was significant
and on the critical path - the database thread.
This patch merges the two workers so that the database thread feeds a
mostly non-idle task.
# API and ABI breaking changes
N/A
# Expected complexity level and risk
I'll call it a `3` mainly because it touches the durability layer in
some capacity, although overall it's a simplification since it removes
the transaction reordering logic that previously existed to paper over a
race condition that @kim had originally fixed in #4661.
# Testing
Pure refactor
# Description of Changes
Bumps the timeout for two TypeScript query builder tests. Both tests
were using the default 5-second timeout, and recently they seem to be
hitting it pretty consistently. I've increased the timeout to 15 seconds
for these two tests.
# API and ABI breaking changes
None
# Expected complexity level and risk
1
# Testing
TypeScript `build-and-test` should now pass in CI
## Summary
- Removing a table from the schema previously always triggered
`AutoMigrateError::RemoveTable`, forcing `--delete-data` which **nukes
the entire database**
- Now, empty tables can be dropped seamlessly during `spacetime publish`
- Non-empty tables fail with an actionable error guiding the user to
clear rows first
## Changes
- **`auto_migrate.rs`**: Replace hard `RemoveTable` error with
`AutoMigrateStep::RemoveTable`. Compute `removed_tables` set (same
pattern as `new_tables`) and pass to
`auto_migrate_indexes`/`auto_migrate_sequences`/`auto_migrate_constraints`
to skip sub-object diffs for removed tables (cascade handled by
`drop_table`)
- **`update.rs`**: Execute `RemoveTable` step — O(1) emptiness check via
`table_row_count_mut`, then `drop_table`. Fails with clear message if
table has data
- **`formatter.rs`** / **`termcolor_formatter.rs`**: Add
`format_remove_table` to `MigrationFormatter` trait + implementation
- **`publish.rs` (smoketests)**: Update existing test — removing empty
table now succeeds without flags
## Safety
- **Transaction safety**: Emptiness check and drop run in the same
`MutTx` — no window for concurrent inserts
- **Cascade**: `drop_table` already handles removing all indexes,
constraints, and sequences for the table
- **Sub-object filtering**: Indexes/constraints/sequences belonging to
removed tables are filtered from their respective diffs, preventing
orphan `RemoveIndex`/`RemoveConstraint`/`RemoveSequence` steps
- **Rollback**: If the emptiness check fails, the entire migration
aborts before any changes are applied
## Example error output (non-empty table)
```
Cannot remove table `MyTable`: table contains data. Clear the table's rows (e.g. via a reducer) before removing it from your schema.
```
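The guarded drop in `update.rs` can be sketched with a toy stand-in datastore (a `HashMap` of row counts; the real code uses `table_row_count_mut` and `drop_table` inside the same `MutTx`):

```rust
use std::collections::HashMap;

/// Toy datastore: table name -> row count.
struct Db {
    row_counts: HashMap<String, u64>,
}

impl Db {
    /// O(1) emptiness check guards the drop; non-empty tables fail with an
    /// actionable error instead of being removed.
    fn remove_table(&mut self, name: &str) -> Result<(), String> {
        match self.row_counts.get(name) {
            Some(0) => {
                // In the real code, `drop_table` also cascades to indexes,
                // constraints, and sequences.
                self.row_counts.remove(name);
                Ok(())
            }
            Some(_) => Err(format!(
                "Cannot remove table `{name}`: table contains data. Clear the \
                 table's rows (e.g. via a reducer) before removing it from your schema."
            )),
            None => Err(format!("no such table `{name}`")),
        }
    }
}
```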
## Test plan
- [x] `cargo check -p spacetimedb-schema -p spacetimedb-core` passes
- [x] All 14 `auto_migrate` schema tests pass (2 new:
`remove_table_produces_step`,
`remove_table_does_not_produce_orphan_sub_object_steps`)
- [x] All 3 `update` execution tests pass (2 new:
`remove_empty_table_succeeds`, `remove_nonempty_table_fails`)
- [x] Updated existing `auto_migration_errors` test (no longer expects
`RemoveTable` error)
- [x] Updated smoke test: `cli_can_publish_remove_empty_table` expects
success
🤖 Generated with [Claude Code](https://claude.com/claude-code)
---------
Co-authored-by: Claude <noreply@anthropic.com>
Co-authored-by: clockwork-labs-bot <clockwork-labs-bot@users.noreply.github.com>
# Description of Changes
Fixes #4584. Now we check, fully recursively, whether compound
`AlgebraicTypeUse`s contain a ref.
# Expected complexity level and risk
1
# Testing
- [x] Added a test to ensure correct codegen
---------
Co-authored-by: Zeke Foppa <196249+bfops@users.noreply.github.com>
# Description of Changes
Replace `.unwrap()` with `.map_err(log_and_500)?` on the
`get_database_by_identity()` call in the WebSocket subscribe handler
(`crates/client-api/src/routes/subscribe.rs`).
This was the only call site in the codebase using `.unwrap()` for this
method — all four other call sites in `database.rs` already use
`.map_err(log_and_500)?`. A transient database error during WebSocket
connection setup would panic the server instead of returning an HTTP
500.
Closes #4686.
# API and ABI breaking changes
None.
# Expected complexity level and risk
1 — Single-token replacement matching the established pattern used
everywhere else.
# Testing
- [ ] Verified the fix matches the error-handling pattern used at all
other `get_database_by_identity` call sites
- [ ] Confirmed `log_and_500` is already imported in the file
# Description of Changes
Support is added to Rust, C#, C++, and TS modules.
# API and ABI breaking changes
None
# Expected complexity level and risk
1
# Testing
I wasn't able to regenerate the module bindings for the SDK tests; I hit
an error while doing so.
---------
Co-authored-by: Jason Larabie <jason@clockworklabs.io>
# Description of Changes
- This is causing a failure of the release dry-run check.
# API and ABI breaking changes
None
# Expected complexity level and risk
1 - this is just a release fix
# Testing
- [x] Release dry-run passes
# Description of Changes
The commits in order do:
1. Drive-by: Small commit that makes `Table::is_row_present` delegate to
`TableInner::is_row_present`. Was found while debugging a test.
2. Adds `TableIndex::iter`, which is then used in
`reconstruct_index_num_key_bytes` so that it works for any type of
index. This will be used in a follow-up PR to extend test coverage to
all index kinds. The code sharing for btree indices and hash indices is
also improved.
3. Simplifies, with a macro, the definition of all index iterator types
so that each variant doesn't have to be mentioned twice. This should
make it easier to scale to new index kinds in the future.
# API and ABI breaking changes
None
# Expected complexity level and risk
1
# Testing
A test is amended to use `TableIndex::iter`. It will be exercised more
in a follow up PR.
Require a runtime handle and identifier for `spawn_tokio_stats`, so
multiple runtimes can be instrumented.
# Expected complexity level and risk
1
# Testing
n/a
# Description of Changes
Replaces the warmup period in the distributed version of the `keynote-2`
benchmark with an explicit start barrier.
1. Removes `--warmup-seconds` from the distributed benchmark flow
2. Adds an explicit `starting` phase where generators start their local
epoch and POST `/started`
3. Makes the coordinator wait for all participant start acknowledgements
before beginning the measured window
4. Adds `--start-ack-timeout-seconds` as the timeout for that start
barrier
5. Removes `warmupSeconds` from the distributed benchmark
protocol/result types
# API and ABI breaking changes
N/A
# Expected complexity level and risk
1.5
# Testing
N/A
# Description of Changes
We only benchmark the TypeScript module and SDK now.
# API and ABI breaking changes
None
# Expected complexity level and risk
1
# Testing
N/A
# Description of Changes
The JS worker is intentionally long-lived. Before this patch, we
essentially had a memory leak where V8 call-local handles and some
host-side call state would survive/accumulate across multiple calls on
the worker. That created gradual heap growth over time, more GC work,
and eventually enough slowdown and heap pressure that the isolate needed
to be replaced.
And even though we periodically check heap statistics to determine
if/when we need to replace the isolate, execution latencies can (and
did) degrade dramatically before this kicks in.
Now each reducer call is given a fresh V8 `HandleScope` in which to
execute instead of reusing a single global scope. This scope is then
dropped at the end of each run, which avoids retaining/accumulating
call-local JS objects over the JS worker's lifetime.
This patch also makes end-of-call host cleanup explicit, lowers the
default heap-check cadence, and limits exported heap metrics to the JS
worker only. Previously we had poor heap observability for diagnosis as
heap metrics from the instance pool (for procedures) could overwrite
heap metrics for the worker.
### What changed
#### 1. Add a fresh V8 `HandleScope` for every invocation
Each reducer, view, and procedure call now opens a nested V8 scope for
the duration of that call.
This preserves the existing long-lived isolate and context, but gives
every invocation its own temporary handle lifetime. Call-local V8
handles now die when the invocation returns instead of sticking around
until the worker exits.
As part of that refactor:
- Hook locals are rebuilt inside the per-call scope instead of being
tied to the worker-lifetime scope.
- The reducer args `ArrayBuffer` is stored and re-used as a
worker-lifetime local.
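The scope-per-call idea can be illustrated without the `v8` crate: handles registered in a scope are released when the scope drops, so repeated invocations do not accumulate call-local state. All names here are illustrative; V8's `HandleScope` provides the real guarantee:

```rust
use std::cell::Cell;
use std::rc::Rc;

/// Toy per-call scope that tracks how many handles it allocated and
/// releases them on drop, like a V8 `HandleScope`.
struct CallScope {
    live: Rc<Cell<usize>>, // shared count of live handles in the "isolate"
    allocated: usize,
}

impl CallScope {
    fn new(live: Rc<Cell<usize>>) -> Self {
        Self { live, allocated: 0 }
    }

    fn alloc_handle(&mut self) {
        self.allocated += 1;
        self.live.set(self.live.get() + 1);
    }
}

impl Drop for CallScope {
    fn drop(&mut self) {
        // All call-local handles die with the scope.
        self.live.set(self.live.get() - self.allocated);
    }
}

/// One invocation: open a fresh scope, allocate call-local handles, and let
/// the scope drop when the call returns.
fn run_call(live: &Rc<Cell<usize>>) {
    let mut scope = CallScope::new(Rc::clone(live));
    scope.alloc_handle();
    scope.alloc_handle();
    // `scope` drops here; its handles are released.
}
```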
#### 2. Make end-of-call cleanup a real boundary
The V8 host now force-clears leftover per-call host state at the end of
a function call.
Specifically:
- Any row iterators left behind by guest code are cleared.
- Any unfinished timing spans are cleared.
- We log when that cleanup had to happen so leaked call-local state is
visible instead of silently persisting across invocations.
#### 3. Lower the default heap-check cadence
The default V8 heap policy is now more aggressive about checking worker
heap usage.
Defaults changed from:
- `heap-check-request-interval = 65536`
- `heap-check-time-interval = 30s`
to:
- `heap-check-request-interval = 4096`
- `heap-check-time-interval = 5s`
These settings remain configurable through the existing `v8-heap-policy`
config.
#### 4. Only export heap metrics for the instance-lane worker
Heap metrics now reflect only the long-lived instance lane.
Specifically:
- Exported V8 heap gauges are emitted only for
`worker_kind="instance_lane"`.
- Pooled workers no longer publish these heap metrics.
- Per-database metric cleanup was simplified accordingly.
This avoids the last-writer-wins issue from the pooled instances while
keeping the metrics focused on the worker that accumulates state over
time and is most relevant for long-run slowdown diagnosis.
# API and ABI breaking changes
None
# Expected complexity level and risk
2
# Testing
Manually tested via the keynote-2 benchmark. Will add the keynote
benchmark to CI which will serve as a regression test.
## Summary
On macOS/Apple platforms, Objective-C++ defines `Nil` as a macro
(`#define Nil nullptr` in `objc/objc.h`). Unreal Engine compiles all C++
as Objective-C++ on Apple platforms, so `FSpacetimeDBUuid::Nil()` in
`Builtins.h` collides with this macro and produces:
```
error: expected member name or ';' after declaration specifiers; 'nullptr' is a keyword in Objective-C++
```
Every UE project using the SpacetimeDB Unreal SDK on macOS hits this
build failure.
## Affected platforms
- macOS (any version) — Unreal Engine 5.x compiles as Objective-C++ on
Apple platforms
- Affects all SpacetimeDB Unreal SDK versions that expose
`FSpacetimeDBUuid::Nil()`
## Reproduction steps
1. Create or open any Unreal Engine project on macOS
2. Add the SpacetimeDB Unreal SDK (v2.1.0)
3. Build the project
4. Observe the build failure in `Builtins.h` at
`FSpacetimeDBUuid::Nil()`
## Fix
Added `#pragma push_macro` / `#undef Nil` / `#pragma pop_macro` guards
in `Builtins.h`:
- **Before struct definitions**: Save and undefine the `Nil` macro so it
doesn't interfere with `FSpacetimeDBUuid::Nil()` and any other
identifiers using `Nil`
- **After the last closing brace**: Restore the original `Nil` macro so
downstream Objective-C++ translation units are unaffected
This is the standard UE pattern for handling platform macro collisions
(similar to how UE itself handles `check`, `verify`, `TEXT`, etc. macro
conflicts).
## Test plan
- [ ] Verify macOS UE build succeeds with the SDK included
- [ ] Verify Windows/Linux builds are unaffected (the `#ifdef Nil` guard
is a no-op when the macro is not defined)
- [ ] Verify `FSpacetimeDBUuid::Nil()` and `NilUuid()` Blueprint
functions work correctly at runtime
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: Jason Larabie <jason@clockworklabs.io>
# Description of Changes
Previously, a module’s JS worker thread was fed through a zero-capacity
channel. That made every request handoff a rendezvous between the async
producer task and the single JS worker thread. Under high concurrency,
that synchronous handoff showed up as hot `flume`/lock/wakeup stacks on
the critical path - the JS worker thread.
This patch brings V8 execution in line with WASM which also uses an
unbounded request queue.
Changing that handoff to an unbounded queue decouples request producers
from the JS worker. Producers can enqueue work without synchronizing
directly with the worker on every request, and the worker can continue
draining queued requests without paying the rendezvous cost each time.
This shortens the critical path, reduces scheduler/locking overhead and
increases throughput.
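A dependency-free sketch of the decoupled handoff, using std's unbounded `mpsc::channel` in place of the actual queue feeding the JS worker:

```rust
use std::sync::mpsc::channel;
use std::thread;

/// Enqueue `n` requests without synchronizing with the worker, then let a
/// worker thread drain them in order.
fn drain_requests(n: u32) -> Vec<u32> {
    let (tx, rx) = channel::<u32>();
    for req in 0..n {
        tx.send(req).unwrap(); // never blocks: no rendezvous with the worker
    }
    drop(tx); // close the queue so the worker's drain loop terminates
    let worker = thread::spawn(move || rx.iter().collect::<Vec<_>>());
    worker.join().unwrap()
}
```

With a zero-capacity (rendezvous) channel, each `send` would instead park until the worker arrived to receive it, which is exactly the handoff cost this change removes.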
# API and ABI breaking changes
None
# Expected complexity level and risk
2
# Testing
Manual performance testing
# Description of Changes
#### 1. Cache the connection-id hex prefix and reuse it when generating
event IDs
Previously, the SDK rebuilt the connection-id string on every event. Now
it computes the prefix once per connection-id update and reuses it.
#### 2. Replace Promise-chained inbound message processing with a
synchronous ordered drain loop
Inbound messages are now processed in order through a direct drain loop
instead of a promise chain, reducing scheduler overhead.
#### 3. Cache encoded reducer/procedure names and use specialized
CallReducer/CallProcedure writers
Previously, reducer/procedure calls always went through the generic
`ClientMessage` object path and re-encoded the method name each time.
Now the SDK caches UTF-8 encoded reducer/procedure names and uses direct
`CallReducer` / `CallProcedure` writers on the hot path, while keeping
the generic path as fallback.
# API and ABI breaking changes
None
# Expected complexity level and risk
2
# Testing
Manual: 100K -> 130K TPS running the keynote-2 benchmark on an Apple M2
```
MAX_INFLIGHT_PER_WORKER=512 pnpm run bench test-1 --seconds 10 --concurrency 50 --alpha 1.5 --connectors spacetimedb
```
# Description of Changes
Updates the TS client defaults for the keynote-2 bench to optimize
throughput. These numbers were derived from runs on an Apple M2, but I'd
be surprised if this configuration was sub-optimal on other platforms.
# API and ABI breaking changes
None
# Expected complexity level and risk
1
# Testing
Manual
# Description of Changes
Message compression is now configurable for both the Rust and TypeScript
keynote benchmark clients, with the default being no compression. Before
this patch the Rust client was using no compression and the TypeScript
client was using gzip by default.
# API and ABI breaking changes
None
# Expected complexity level and risk
1
# Testing
Manual
# Description of Changes
Was doing some profiling and found some low-hanging optimizations for
the TS client, namely caching table primary keys and reusing the writer
for reducer args.
# API and ABI breaking changes
None
# Expected complexity level and risk
1
# Testing
...
# Description of Changes
Explicitly disables core pinning on macOS since it's not supported
anyway.
# API and ABI breaking changes
None
# Expected complexity level and risk
1
# Testing
N/A
Should have no behavior change since setting core affinity was basically
a no-op already.
Co-authored-by: clockwork-labs-bot <clockwork-labs-bot@users.noreply.github.com>
Fixes #4736
# Description of Changes
Single-column BTree indexes accepted `Range<T>` in their type signatures
(via `IndexScanRangeBounds`), but the runtime `filter()` and `delete()`
implementations always serialized the argument as a point value using
`datastore_index_scan_point_bsatn`. Passing a `Range` object caused
`SyntaxError: Cannot convert [object Object] to a BigInt` because the
Range object was fed directly to the column serializer.
The fix adds `Range` detection to the single-column BTree index code
path in `runtime.ts`. When a `Range` is passed, it serializes the bounds
and calls `datastore_index_scan_range_bsatn` /
`datastore_delete_by_index_scan_range_bsatn` instead. Point queries
continue to use the existing fast path. The multi-column index code
already handled `Range` correctly — this brings single-column indexes to
parity.
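The dispatch-on-argument-shape fix can be illustrated in Rust (the actual change lives in the SDK's `runtime.ts`, in TypeScript): ranges go to the range scan, while point values stay on the fast path.

```rust
use std::ops::Range;

/// Toy index argument: either a point value or a range of values.
enum IndexArg {
    Point(u64),
    Range(Range<u64>),
}

/// Scan a toy single-column "index" (a slice of keys), dispatching on the
/// argument shape instead of always treating it as a point.
fn scan(rows: &[u64], arg: IndexArg) -> Vec<u64> {
    match arg {
        IndexArg::Point(v) => rows.iter().copied().filter(|row| *row == v).collect(),
        IndexArg::Range(r) => rows.iter().copied().filter(|row| r.contains(row)).collect(),
    }
}
```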
# API and ABI breaking changes
None. This is a pure bugfix. Existing queries are unaffected, and Range
queries that previously crashed now work as the types imply.
# Expected complexity level and risk
Low. The change is localized to one code path in `runtime.ts` and
mirrors the existing multi-column range serialization logic. All 170
existing tests pass.
# Testing
- All 170 vitest tests pass
- Manually confirmed the fix resolves the reported `SyntaxError` in a
production SpacetimeDB TypeScript module using
`ctx.db.players.fooId.filter(new Range(...))` on a single-column BTree
index
Fixes #4592
Adds a "Supported Column Types" section to the Index documentation page
listing:
- **Supported types**: integers (all widths), bool, String, Identity,
ConnectionId, Uuid, Hash, and no-payload enums with
`#[derive(SpacetimeType)]`
- **Unsupported types**: floats (no total ordering),
ScheduleAt/TimeDuration/Timestamp (not yet supported, links #2650),
Vec/arrays, enums with payloads, nested structs
- **Workaround tip**: scaled integers for floating-point coordinates
- **Direct index restrictions**: only u8/u16/u32/u64 and no-payload
enums
The list is derived from the `FilterableValue` trait implementations in
`crates/lib/src/filterable_value.rs` and the direct index validation in
`crates/schema/src/def/validate/v9.rs`.
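The scaled-integer workaround mentioned above can be sketched as fixed-point conversion helpers; the millimeter scale is just an example:

```rust
/// Store a coordinate as a fixed-point integer (millimeters) so it can back
/// an index, since floats have no total ordering.
fn to_fixed_mm(meters: f64) -> i64 {
    (meters * 1000.0).round() as i64
}

/// Convert the indexed fixed-point value back to meters for display/use.
fn from_fixed_mm(mm: i64) -> f64 {
    mm as f64 / 1000.0
}
```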
Co-authored-by: clockwork-labs-bot <clockwork-labs-bot@users.noreply.github.com>
# Description of Changes
Introduces the types `TypedIndexKey` and `IndexKey` to free the table
index code from its direct dependency on `AlgebraicValue`, allowing
index scans by BSATN and by `RowRef` without first going through
`AlgebraicValue`.
This also has the effect of optimizing string scans by avoiding the
allocation in `AlgebraicValue::String`.
This also lays groundwork for a future byte-array-based optimization for
all-primitive multi-column indexes.
Also in the future, this will enable optimizing `iter_by_col_eq`, which
is used by the frequent connect/disconnect logic.
# API and ABI breaking changes
None
# Expected complexity level and risk
3? Unsafe code and very load bearing code.
# Testing
Should be covered by existing tests.
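The byte-array idea mentioned above can be sketched briefly (illustrative TypeScript, not the actual Rust implementation): if every column of an all-primitive key is encoded big-endian, lexicographic byte comparison agrees with the logical key order, so the index can compare keys without decoding them back into values.

```typescript
// Encode a multi-column key of u32 columns as big-endian bytes so that
// byte-wise comparison matches numeric comparison column by column.
function encodeKeyU32(cols: number[]): Uint8Array {
  const out = new Uint8Array(cols.length * 4);
  const view = new DataView(out.buffer);
  cols.forEach((c, i) => view.setUint32(i * 4, c, false)); // big-endian
  return out;
}

// Plain lexicographic byte comparison; no decoding required.
function compareBytes(a: Uint8Array, b: Uint8Array): number {
  for (let i = 0; i < Math.min(a.length, b.length); i++) {
    if (a[i] !== b[i]) return a[i] < b[i] ? -1 : 1;
  }
  return a.length - b.length;
}
```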
Use a bounded channel for submitting transactions to the durability
layer, so as to apply backpressure to the db when transaction volume is
too high.
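A minimal sketch of why a bounded channel applies backpressure (not the actual durability code, which is Rust; names here are illustrative): `send()` suspends the producer while the channel is at capacity, so a fast producer is paced by the slow consumer instead of growing an unbounded queue.

```typescript
// A toy bounded channel: send() awaits until there is room.
class BoundedChannel<T> {
  private items: T[] = [];
  private waiters: (() => void)[] = [];
  constructor(private capacity: number) {}

  async send(item: T): Promise<void> {
    // Backpressure: block (asynchronously) while the buffer is full.
    while (this.items.length >= this.capacity) {
      await new Promise<void>((resolve) => this.waiters.push(resolve));
    }
    this.items.push(item);
  }

  recv(): T | undefined {
    const item = this.items.shift();
    const waiter = this.waiters.shift();
    if (waiter) waiter(); // wake one blocked producer
    return item;
  }
}
```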
---------
Co-authored-by: Jeffrey Dallatezza <jeffreydallatezza@gmail.com>
Co-authored-by: clockwork-labs-bot <clockwork-labs-bot@users.noreply.github.com>
# Description of Changes
This cleans up the `keynote-2` benchmark CLI and fixes the `bench` path
that broke after `demo` and `bench` started sharing a single parser.
`demo` and `bench` now parse their own command grammars instead of
sharing one import-time parser, which avoids `bench` inheriting
`demo`-style validation and breaking on `test-1 --connectors ...`. In
addition, `bench` now uses the same `cac`-builder as `demo`, and I've
deleted the unused `runner_1.ts`.
# API and ABI breaking changes
None
# Expected complexity level and risk
3
# Testing
Manually tested the following from `templates/keynote-2`:
- `pnpm bench test-1 --seconds 10 --concurrency 50 --alpha 1.5
--connectors spacetimedb`
- `pnpm demo --seconds 1 --concurrency 5 --alpha 1.5 --systems
spacetimedb --skip-prep --no-animation`
- `deno run --sloppy-imports -A src/demo.ts --help`
- `deno run --sloppy-imports -A src/cli.ts --help`
- `deno run --sloppy-imports -A src/cli.ts test-1 --seconds 10
--concurrency 50 --alpha 1.5 --connectors spacetimedb`
# Description of Changes
While testing #4663, I discovered the server would crash from a V8 out
of memory error after processing many requests. Before #4663, this would
not happen. I theorized that because we now have a single JS worker that
can process an unbounded number of reducer calls over its lifetime, any
V8 heap retention that would previously have been spread across several
pooled isolates now accumulates in one isolate.
This patch now periodically collects heap statistics and forces GC or
replaces the isolate if memory cannot be reclaimed. This greatly reduces
the risk of hitting the V8 heap limit and crashing the server. It
doesn't remove the risk entirely, however, and the same risk was already
present before we switched to a single worker model in #4663. To remove
the risk of crashing the server entirely, we would need to run V8 in a
separate process.
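The recovery policy described above can be sketched as a small decision function. This is a hedged sketch: the real code reads V8 heap statistics through the embedder API, and the threshold and names here are illustrative, not the actual values.

```typescript
type HeapAction = "none" | "force_gc" | "replace_isolate";

// Decide what to do given sampled heap statistics. If a forced GC has
// already failed to reclaim memory, escalate to replacing the isolate.
function heapPolicy(
  usedBytes: number,
  limitBytes: number,
  gcFailedBefore: boolean,
): HeapAction {
  const usage = usedBytes / limitBytes;
  if (usage < 0.7) return "none";          // plenty of headroom
  if (!gcFailedBefore) return "force_gc";  // try to reclaim first
  return "replace_isolate";                // GC didn't help: start fresh
}
```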
# API and ABI breaking changes
None
# Expected complexity level and risk
3
# Testing
TBD
# Description of Changes
The current keynote-2 benchmark pipelines operations via
`MAX_INFLIGHT_PER_WORKER` in order to simulate a large number of client
connections while running the benchmark locally or on a single machine.
This patch adds a distributed benchmark mode for `templates/keynote-2`
so explicit SpacetimeDB client connections can be spread across multiple
machines without changing the existing single-process benchmark flow.
This is a pure extension. `npm run bench` and the current
`src/core/runner.ts` path remain intact. The new distributed path adds a
small coordinator/generator/control-plane harness specifically for
multi-machine ts client runs.
- New CLI entry points `bench-dist-coordinator`, `bench-dist-generator`,
and `bench-dist-control` were added
- The coordinator defines the benchmark window
- Generators begin submitting requests during warmup, but warmup
transactions are excluded from TPS
- Throughput is measured from the server-side committed transfer
counter, not client-local TPS
- Each connection runs closed-loop with one request at a time in this
distributed mode
- Connection startup is bounded-parallel (`--open-parallelism`) to avoid
a connection storm
- Verification is run by the coordinator after the epoch
- Late generators can be registered after a run to increase load on the
server incrementally
- If a participating generator dies and never sends `/stopped`, the
epoch result is flagged with an error so the run can be retried cleanly
See `DEVELOP.md` for instructions on how to run.
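Two of the measurement rules above (warmup traffic excluded from TPS; throughput taken from the server-side committed counter rather than client-local counts) reduce to a simple calculation. This is an illustrative sketch, not the actual harness code:

```typescript
// TPS for one epoch, computed from the server-side committed transfer
// counter sampled at the epoch boundaries. Warmup transactions commit
// before the start sample is taken, so they never enter the numerator.
function epochTps(
  counterAtEpochStart: number, // sampled when the benchmark window opens
  counterAtEpochEnd: number,   // sampled when the window closes
  epochSeconds: number,
): number {
  return (counterAtEpochEnd - counterAtEpochStart) / epochSeconds;
}
```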
# API and ABI breaking changes
N/A
# Expected complexity level and risk
3
# Testing
Manual
# Description of Changes
This recovers a bit of lost performance from event tables by:
- avoiding putting the `row: ProductValue` in a vector before merging
into the committed state.
- directly constructing `Arc<[ProductValue]>`
This also shaves off 8 bytes from `ReplicaCtx`.
# API and ABI breaking changes
None
# Expected complexity level and risk
2?
# Testing
Covered by existing tests.
# Description of Changes
Now we get a `--help` for the benchmark, which is nicer. Also now can
run under deno, with `deno --sloppy-imports -A src/demo.ts` (might be
useful, deno's websocket is implemented in native code while node's is
implemented in JS). I removed the
[BOM](https://en.wikipedia.org/wiki/Byte_order_mark) because it seems
unintentional (only found in `templates/keynote-2`) and was causing a
little bit of weirdness.
Also, fix the rust benchmark client as a follow-up to #4616
# Expected complexity level and risk
1
# Testing
- [x] Works under deno and has usage
# Description of Changes
Before this change, JS reducer requests borrowed a `JsInstance` from a
pool. If no idle instance was available, we created another instance,
which meant another V8 worker thread. Under load, this meant reducers
bouncing across multiple OS threads.
After this change, JS reducers go through a single long-lived
`JsInstance` fed by a FIFO queue which results in much better cache
locality. More accurately, each module now allocates a single OS thread,
on which reducers (and most operations) run. Modules do not share
workers/threads. And modules do not create multiple threads for running
reducers.
Note, the original instance pool is still used for procedures. It should
probably be bounded, but I didn't make any changes to it. It's also used
for executing views during initial subscription to avoid a reentrancy
deadlock. The latter should be fixed and moved over to the JS worker
thread at some point.
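The single-worker FIFO idea can be sketched as a serial queue (illustrative TypeScript, not the actual Rust host code): all reducer calls for a module are funneled through one queue and run strictly one at a time, instead of fanning out across a pool of workers.

```typescript
// A serial FIFO queue: jobs run in submission order, never concurrently.
class SerialQueue {
  private tail: Promise<unknown> = Promise.resolve();

  run<T>(job: () => Promise<T>): Promise<T> {
    const result = this.tail.then(job);
    // A failed job must not stall the queue for subsequent jobs.
    this.tail = result.catch(() => {});
    return result;
  }
}
```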
# API and ABI breaking changes
N/A
# Expected complexity level and risk
4
# Testing
```
NODE_OPTIONS="--max-old-space-size=8192" \
MAX_INFLIGHT_PER_WORKER=512 \
BENCH_PRECOMPUTED_TRANSFER_PAIRS=1000000 \
pnpm bench test-1 --seconds 10 --concurrency 50 --alpha 1.5 --connectors spacetimedb
```
```
50K TPS -> 85K TPS on m2 mac
```
This is the implementation to resolve #4424
# Description of Changes
* Add support for `IEnumerable<T>` as a valid C# View return type in
codegen, so view implementations can return query results without
forcing `ToList()`. The generator now detects `IEnumerable<T>` and
serializes it via the list serializer.
* Snapshots have been updated with a test confirming that
`IEnumerable<T>` is supported, rather than checking that it is
unsupported (a check we had added when removing support during the
transition away from `Query<T>` usage).
* Refreshes docs/examples to reflect the new supported signature.
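A rough TypeScript analogue of the codegen change (the real change is in the C# generator; names here are illustrative): the list serializer accepts any iterable rather than requiring a materialized list, so a lazily-produced query result can be written through the same path.

```typescript
// Serialize any iterable through the list path; a lazy generator and a
// materialized array take the exact same route.
function serializeList<T>(
  values: Iterable<T>,
  writeElem: (v: T) => string,
): string {
  const parts: string[] = [];
  for (const v of values) parts.push(writeElem(v));
  return `[${parts.join(",")}]`;
}

// A generator stands in for a query that yields rows lazily, i.e. the
// equivalent of returning IEnumerable<T> without calling ToList().
function* lazyRows(): Iterable<number> {
  yield 1;
  yield 2;
  yield 3;
}
```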
# API and ABI breaking changes
No breaking changes. This is additive support for an existing .NET
interface.
# Expected complexity level and risk
2 - Low
# Testing
- [X] Rebuilt the CLI locally, ran regression tests, and ran
`dotnet test` in the C# module bindings.
- [X] Tested updated docs code snippets to ensure they resolve in an IDE
and publish.
---------
Signed-off-by: Ryan <r.ekhoff@clockworklabs.io>
# Description of Changes
This PR adapts the Rust SDK test suite to work with the wasm version
added in https://github.com/clockworklabs/SpacetimeDB/pull/4089 (which
I've closed in favor of this PR).
Most of the changes revolve around wasm's different async semantics -
everything runs in one thread, so things that relied on background
threads didn't work directly. Several tests would lock up because
something in them blocked synchronously, which blocked any background
work from progressing.
We moved the test-clients contents into a `test_handlers.rs` so that it
could be called from both `main` (for native tests) and `lib` (for wasm
tests). To show what actually changed, use:
```bash
git diff --no-index -- <(git show origin/master:sdks/rust/tests/procedure-client/src/main.rs) sdks/rust/tests/procedure-client/src/test_handlers.rs
```
(or similar for other test-clients)
# API and ABI breaking changes
None, I think/hope.
# Expected complexity level and risk
2
# Testing
- [x] I've augmented the CI to also run the test suite with the `web`
feature
---------
Signed-off-by: Zeke Foppa <196249+bfops@users.noreply.github.com>
Co-authored-by: Thales R <thlsrmsdev@gmail.com>
Co-authored-by: Zeke Foppa <bfops@users.noreply.github.com>
# Description of Changes
Fix the SQLite RPC benchmark so transfers actually persist and
verification produces useful output.
**`sqlite-rpc-server.ts`:**
- Add `.run()` to both Drizzle `tx.update()` chains in `rpcTransfer`.
Without this, the UPDATE statements were never executed — the benchmark
was only measuring SELECT + HTTP overhead, not real transactional
writes. This inflated SQLite TPS numbers significantly.
- Replace the generic `"internal error"` catch-all in `handleRpc` with
`rpcErr()`, which returns the actual error message to the client. Also
wrap the outer HTTP handler in a try/catch so errors outside `handleRpc`
are surfaced too.
**`sqlite_rpc.ts` (connector):**
- Detect `{ skipped: true }` from the server's verify endpoint and throw
a clear error telling the user to set `SEED_INITIAL_BALANCE` on the RPC
server process. Previously this was silently treated as success.
**`runner.ts` / `runner_1.ts`:**
- Log "Verification passed" on success and "Verification failed:
\<reason\>" (as a string, not a raw Error object) on failure, so the
outcome is always visible in bench output.
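The `.run()` bug class above can be shown with a toy model (not Drizzle itself): builder-style APIs return a lazy statement, and chaining `set()`/`where()` configures it without executing anything until a terminal call such as `.run()` is made.

```typescript
// Toy lazy update builder: nothing executes until run() is called.
class LazyUpdate {
  executed = false;
  set(_values: Record<string, unknown>): this {
    return this; // still lazy: only records what to do
  }
  where(_cond: unknown): this {
    return this; // still lazy
  }
  run(): void {
    this.executed = true; // only here does the statement execute
  }
}
```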
# API and ABI breaking changes
None.
# Expected complexity level and risk
1
# Testing
- [ ] Run `npx tsx src/rpc-servers/sqlite-rpc-server.ts` (with
`SEED_INITIAL_BALANCE` set), then `npm run test-1 -- --connectors
sqlite_rpc --seconds 5` with `VERIFY=1`. Confirm TPS drops significantly
vs the old (broken) numbers and "Verification passed" appears.
- [ ] Run the same without `SEED_INITIAL_BALANCE` on the server process.
Confirm "Verification failed" with a message about the missing env var
(not "internal error").