# Description of Changes
The PR adds benchmarks that compare how we encode composite indices for primitive types today (`_AV`) with how we intend to store them in the future.
These benchmarks sample the standard uniform distribution for some static types, e.g., `(i32, i32, u64)`, and then either convert the values to how we store keys today (`_AV` stands for `AlgebraicValue`) or encode them into a byte array (`_Enc` stands for "encoded"), e.g., `[u8; 16]` in the case of the BitCraft index, which is what the future optimization would entail.
The comparison for each type is then between `_AV` and `_Enc`.
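As a rough illustration, the `_Enc` form packs a `(i32, i32, u64)` key into a fixed `[u8; 16]` (a sketch: the `encode_key` name and little-endian layout are assumptions, not the benchmark's exact encoding):

```rust
/// Sketch: pack a `(i32, i32, u64)` composite key into a fixed-size
/// byte array, as the `_Enc` benchmarks do (names/layout illustrative).
fn encode_key(a: i32, b: i32, c: u64) -> [u8; 16] {
    let mut buf = [0u8; 16];
    buf[0..4].copy_from_slice(&a.to_le_bytes());
    buf[4..8].copy_from_slice(&b.to_le_bytes());
    buf[8..16].copy_from_slice(&c.to_le_bytes());
    buf
}

fn main() {
    let key = encode_key(1, -2, 3);
    // Equal tuples encode to equal bytes, so the array can serve
    // directly as a hash-map key, with no tag dispatch or allocation.
    assert_eq!(encode_key(1, -2, 3), key);
    println!("{key:?}");
}
```

The benefit over `AlgebraicValue` is that hashing and equality run over 16 contiguous bytes instead of walking a tagged tree.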
Here are some numbers on an i9-14900K:
```
IFoldHash<I32xI32xU64_AV>/insert_random
time: [85.817 ns 87.344 ns 88.648 ns]
IFoldHash<I32xI32xU64_Enc>/insert_random
time: [41.329 ns 42.217 ns 43.120 ns]
IFoldHash<I32xI32xU64_AV>/seek_random
time: [108.48 ns 111.47 ns 114.05 ns]
IFoldHash<I32xI32xU64_Enc>/seek_random
time: [43.468 ns 45.974 ns 48.468 ns]
IFoldHash<I32xI32xU64_AV>/delete_random
time: [112.36 ns 120.10 ns 127.67 ns]
IFoldHash<I32xI32xU64_Enc>/delete_random
time: [49.078 ns 51.745 ns 54.182 ns]
```
These benchmarks strongly suggest that the future optimization will be
highly profitable.
# API and ABI breaking changes
None
# Expected complexity level and risk
0, not even trivial :)
# Testing
Nothing to test / this is benchmark code only.
# Description of Changes
The goal of this PR is to:
- Make it easy to switch our backing implementation of identifiers, as
seen in the commit changing to `LeanString`.
- Improve type safety of our identifiers by having every identifier
start with `RawIdentifier` and then `Identifier` is just a wrapper with
validation on construction. `TableName` and `ReducerName` are then just
wrappers around `Identifier`.
- Reduce allocation in `InstanceEnv` by using the now clone-on-write
O(1) + SSO optimized `Identifier`.
- Reduce allocations in the query engine as a consequence of improving
`RawIdentifier` and `Identifier`.
- Further optimize `&'static str` strings by avoiding allocation even for long ones. This is supported by `impl From<&'static str> for RawIdentifier` as well as by switching to `LeanString`.
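The layering described above can be sketched as follows (a simplification: the real `RawIdentifier` is backed by `LeanString`, and the validation rules here are stand-ins):

```rust
/// Sketch of the identifier layering. The backing type can be swapped
/// in one place (`String` here stands in for `LeanString`).
#[derive(Clone, Debug, PartialEq)]
struct RawIdentifier(String);

/// `Identifier` is just a wrapper that validates on construction,
/// so everything downstream can assume a well-formed name.
#[derive(Clone, Debug, PartialEq)]
struct Identifier(RawIdentifier);

impl Identifier {
    fn new(raw: RawIdentifier) -> Result<Self, String> {
        // Stand-in validation rules, not the real ones.
        if raw.0.is_empty() || raw.0.chars().next().is_some_and(|c| c.is_ascii_digit()) {
            return Err(format!("invalid identifier: {:?}", raw.0));
        }
        Ok(Identifier(raw))
    }
}

/// Thin wrappers inherit `Identifier`'s guarantees for free.
#[derive(Clone, Debug, PartialEq)]
struct TableName(Identifier);

fn main() {
    let id = Identifier::new(RawIdentifier("players".into())).unwrap();
    let table = TableName(id);
    assert!(Identifier::new(RawIdentifier("1bad".into())).is_err());
    println!("{table:?}");
}
```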
The PR results in a roughly 2k TPS improvement over master for WASM + hash indices. This should also help V8, as the change is VM-agnostic.
# API and ABI breaking changes
None
# Expected complexity level and risk
3? Mostly just small changes in many places, but in a sea of small
changes, mistakes can happen.
# Testing
Covered by existing tests.
# Description of Changes
The first commit defines a type `TableName` that is used in, e.g., `TxData`, and wherever this change was determined profitable and necessary.
`TableName` is backed by
[`ecow::EcoString`](https://docs.rs/ecow/0.2.6/ecow/string/struct.EcoString.html)
which affords O(1) clones and 15 bytes of inline storage, with `mem::size_of::<EcoString>() == 16`.
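As a size sanity check, the trade-off against `String` can be sketched like this (the `InlineStr` layout is a simplified stand-in for `EcoString`):

```rust
use std::mem::size_of;

/// Sketch: why a 16-byte small string helps. `String` is 24 bytes
/// (ptr + len + cap) on 64-bit targets and always heap-allocates;
/// an `EcoString`-style type packs 15 bytes inline into a 16-byte
/// footprint, so short names clone by copying 16 bytes without
/// touching the allocator.
#[allow(dead_code)]
struct InlineStr {
    len: u8,
    buf: [u8; 15], // 15 bytes of inline storage
}

fn main() {
    assert_eq!(size_of::<String>(), 24); // on 64-bit targets
    assert_eq!(size_of::<InlineStr>(), 16);
    assert!("player_state".len() <= 15); // typical names fit inline
}
```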
The second commit does the same for `ReducerName`. This is also used in
reducer execution.
Together, these commits increase throughput by around 5-7k TPS.
# API and ABI breaking changes
None
# Expected complexity level and risk
1
# Testing
Covered by existing tests.
# Description of Changes
Exploit `(ty: RowTypeLayout).layout.fixed` to avoid some work involving
var-len stuff:
- skip `required_var_len_granules_for_row` for only-fixed-len layouts.
- skip `write_large_blobs` for only-fixed-len layouts.
- add a fast path in `has_space_for_row` for only-fixed-len layouts.
This resulted in these methods disappearing in flamegraphs.
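The shape of these fast paths is roughly as follows (a sketch with made-up internals; only the `fixed` check mirrors the PR):

```rust
/// Sketch of one fast path (made-up types; the real check reads
/// `RowTypeLayout`'s `layout.fixed` flag).
struct RowTypeLayout {
    /// True iff the row type contains no var-len components.
    fixed: bool,
}

fn required_var_len_granules_for_row(ty: &RowTypeLayout, var_len_bytes: usize) -> usize {
    if ty.fixed {
        // Fast path: an only-fixed-len layout never needs var-len granules.
        return 0;
    }
    // Slow path: hypothetical accounting with 64-byte granules.
    (var_len_bytes + 63) / 64
}

fn main() {
    assert_eq!(required_var_len_granules_for_row(&RowTypeLayout { fixed: true }, 9999), 0);
    assert_eq!(required_var_len_granules_for_row(&RowTypeLayout { fixed: false }, 130), 3);
}
```

The same early-return shape applies to `write_large_blobs` and `has_space_for_row`.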
# API and ABI breaking changes
None
# Expected complexity level and risk
2, core datastore, but fairly trivial.
# Testing
Covered by existing tests.
# Description of Changes
See tin.
Will be used in a follow up PR to optimize merge.
# API and ABI breaking changes
None
# Expected complexity level and risk
1
# Testing
Exercised in a follow up PR.
# Description of Changes
See tin.
Will be used to optimize merge.
# API and ABI breaking changes
None
# Expected complexity level and risk
1
# Testing
Exercised in follow up.
# Description of Changes
Makes `Pages::copy_filter` more flexible by allowing callers to pass no blob store (policy), so that in a follow-up PR blob stores can be merged wholesale instead, optimizing transaction merging.
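The added flexibility amounts to making the blob-store argument optional (a sketch with made-up types; `None` defers blob handling to a wholesale merge later):

```rust
/// Sketch: a `copy_filter`-style function taking an optional blob store.
/// `None` means "don't deep-copy blobs per row; the caller will merge
/// blob stores wholesale afterwards". All types here are illustrative.
trait BlobStore {
    fn insert(&mut self, blob: &[u8]);
}

struct MemBlobStore(Vec<Vec<u8>>);
impl BlobStore for MemBlobStore {
    fn insert(&mut self, blob: &[u8]) {
        self.0.push(blob.to_vec());
    }
}

fn copy_filter(rows: &[Vec<u8>], mut blobs: Option<&mut dyn BlobStore>) -> usize {
    let mut copied = 0;
    for row in rows {
        // Per-row blob copying only happens when a store was provided.
        if let Some(ref mut bs) = blobs {
            bs.insert(row);
        }
        copied += 1;
    }
    copied
}

fn main() {
    let rows = vec![vec![1u8], vec![2u8]];
    let mut store = MemBlobStore(Vec::new());
    assert_eq!(copy_filter(&rows, Some(&mut store)), 2);
    assert_eq!(store.0.len(), 2);
    // With no store, rows are still copied but blobs are left alone.
    assert_eq!(copy_filter(&rows, None), 2);
}
```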
# API and ABI breaking changes
None
# Expected complexity level and risk
1
# Testing
`copy_filter` and friends are currently unused but will be used in
follow ups.
# Description of Changes
Despecializes direct indices for `u64` keys whose values are too large, converting them into a B-Tree index instead.
Otherwise, we are exposed to OOM aborts.
Also hardens typed index tests to cover more combinations.
This is how the bug was caught.
Some index code is also simplified.
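Conceptually, the guard is a size check before committing to a direct (dense) index; this sketch uses a made-up threshold and names:

```rust
/// Sketch: a direct index for `u64` keys is a dense array indexed by
/// the key itself, so a huge maximum key would allocate a huge table
/// (hence the OOM risk). Fall back to a B-Tree when the key range is
/// too large. The cap below is illustrative, not the real threshold.
enum IndexKind {
    Direct,
    BTree,
}

const MAX_DIRECT_SLOTS: u64 = 1 << 20; // illustrative cap

fn choose_index(max_key: u64) -> IndexKind {
    if max_key < MAX_DIRECT_SLOTS {
        IndexKind::Direct
    } else {
        IndexKind::BTree
    }
}

fn main() {
    assert!(matches!(choose_index(42), IndexKind::Direct));
    assert!(matches!(choose_index(u64::MAX), IndexKind::BTree));
}
```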
# API and ABI breaking changes
None
# Expected complexity level and risk
2?
# Testing
Proptests were generalized to catch the bug and to then verify that it
was fixed.
# Description of Changes
Fixes https://github.com/clockworklabs/SpacetimeDB/issues/1122.
Adds hash indices and exposes them through `#[index(hash)]` for Rust modules, with support for TypeScript and C# to come in follow-ups.
On the client/SDK side, for now, any index is backed by a BTree/native index, as it is the most general.
A hash index may only be queried through point scans, never ranged scans.
Attempting a ranged scan results in an error, using the mechanism implemented in the previous PR (https://github.com/clockworklabs/SpacetimeDB/pull/3974).
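In a Rust module, the new attribute is used roughly like this (a sketch: the table and field names are made up, and the exact attribute spelling should be checked against the module bindings):

```rust
use spacetimedb::table;

// Hypothetical table definition illustrating the column-level attribute.
#[table(name = player)]
pub struct Player {
    #[index(hash)] // point lookups only; ranged scans will error
    name: String,
    score: u64,
}
```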
# API and ABI breaking changes
None
# Expected complexity level and risk
2?
# Testing
A test for ensuring that hash indices cannot be used for range scans is
added.
Tests exercising hash indices will come in the next PR.
# Description of Changes
When doing a ranged seek on a non-ranged index (none such exist yet, but
will be added in a follow up), return an (ABI) error.
Also:
- Use point scans in query execution (`IxScan(Delta)Eq`).
- Refactor table index code with macro `same_for_all_types`.
# API and ABI breaking changes
None
# Expected complexity level and risk
2?
# Testing
Testing the error handling will be possible once hash indices are added
(follow up PR).
# Description of Changes
Provides common traits `Index` and `RangedIndex` that all (current)
variants in `TypedIndex` adhere to.
This is then used to simplify `TypedIndex` by merging common code,
with more simplifications to come later.
The responsibility of tracking statistics is also moved into each index
type, as some can exploit their properties to provide some statistics
for free, rather than storing statistics in fields.
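The division of labor can be pictured with illustrative trait shapes (names and signatures are guesses, not the actual definitions):

```rust
use std::collections::BTreeMap;

type RowPointer = u64;

/// Illustrative shape of the common trait all index variants share.
trait Index {
    type Key;
    fn insert(&mut self, key: Self::Key, row: RowPointer);
    fn seek_point(&self, key: &Self::Key) -> Vec<RowPointer>;
    /// Statistics an index can often derive from its own structure,
    /// rather than tracking in a separate field.
    fn num_keys(&self) -> usize;
}

/// Only ranged variants implement this; a hash index would not.
trait RangedIndex: Index {
    fn seek_range(&self, range: std::ops::Range<Self::Key>) -> Vec<RowPointer>;
}

struct BTreeIdx(BTreeMap<u32, Vec<RowPointer>>);

impl Index for BTreeIdx {
    type Key = u32;
    fn insert(&mut self, key: u32, row: RowPointer) {
        self.0.entry(key).or_default().push(row);
    }
    fn seek_point(&self, key: &u32) -> Vec<RowPointer> {
        self.0.get(key).cloned().unwrap_or_default()
    }
    fn num_keys(&self) -> usize {
        self.0.len() // "for free" from the map, no extra counter
    }
}

impl RangedIndex for BTreeIdx {
    fn seek_range(&self, range: std::ops::Range<u32>) -> Vec<RowPointer> {
        self.0.range(range).flat_map(|(_, v)| v.clone()).collect()
    }
}

fn main() {
    let mut idx = BTreeIdx(BTreeMap::new());
    idx.insert(1, 10);
    idx.insert(2, 20);
    assert_eq!(idx.seek_point(&1), vec![10]);
    assert_eq!(idx.seek_range(1..3), vec![10, 20]);
    assert_eq!(idx.num_keys(), 2);
}
```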
# API and ABI breaking changes
None
# Expected complexity level and risk
2?
# Testing
This is mostly code motion, so it is covered by existing tests.
Further test improvements will come in a follow up PR.
# Description of Changes
Fixes https://github.com/clockworklabs/SpacetimeDB/issues/3240.
Non-unique indices are now backed by a type `SameKeyEntry` which holds
the `RowPointer`s for the same key.
When these `RowPointer`s exceed 4KiB (512 entries), the data structure
switches from using an array list to a hash set.
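A minimal model of that adaptive container (the `SameKeyEntry` name is from the PR; the internals here are guessed, with 512 entries × 8-byte `RowPointer` = 4 KiB):

```rust
use std::collections::HashSet;

type RowPointer = u64;

/// Threshold: 512 pointers * 8 bytes = 4 KiB.
const SPILL_THRESHOLD: usize = 512;

/// Sketch of `SameKeyEntry`: all `RowPointer`s sharing one key,
/// kept in a compact array list first, spilling to a hash set once
/// linear scans would get too expensive.
enum SameKeyEntry {
    Small(Vec<RowPointer>),
    Large(HashSet<RowPointer>),
}

impl SameKeyEntry {
    fn insert(&mut self, ptr: RowPointer) {
        match self {
            SameKeyEntry::Small(v) => {
                v.push(ptr);
                if v.len() > SPILL_THRESHOLD {
                    // Spill: O(1) membership checks from here on.
                    let set = v.drain(..).collect();
                    *self = SameKeyEntry::Large(set);
                }
            }
            SameKeyEntry::Large(s) => {
                s.insert(ptr);
            }
        }
    }

    fn contains(&self, ptr: RowPointer) -> bool {
        match self {
            SameKeyEntry::Small(v) => v.contains(&ptr),
            SameKeyEntry::Large(s) => s.contains(&ptr),
        }
    }

    fn len(&self) -> usize {
        match self {
            SameKeyEntry::Small(v) => v.len(),
            SameKeyEntry::Large(s) => s.len(),
        }
    }
}

fn main() {
    let mut e = SameKeyEntry::Small(Vec::new());
    for p in 0..600 {
        e.insert(p);
    }
    assert_eq!(e.len(), 600);
    assert!(e.contains(599));
    assert!(matches!(e, SameKeyEntry::Large(_)));
}
```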
# API and ABI breaking changes
None
# Expected complexity level and risk
2?
# Testing
Covered by existing tests, though more tests will come in future PRs.
# Description of Changes
Prior to this commit, we had special handling for deletes from
`st_table` during replay, but we did not pair them with inserts to form
updates, instead only treating the delete as a dropped table.
With this commit, we record inserts to `st_table` which will form update
pairs during replay, and handle them appropriately, updating the table's
schema and not dropping the table.
There is a tricky case where a table exists but is empty, and then
within a single transaction:
- The table undergoes a migration s.t. its `table_access` or
`primary_key` changes.
- At least one row is inserted into the table.
In this case, the in-memory table structure will not exist at the point
of the `st_table` insert, but will be created before the corresponding
`st_table` delete, meaning there will be two conflicting `st_table` rows
resident. To handle this case, `CommittedState` tracks a side table,
`replay_table_updated`, which stores a `RowPointer` to the correct
most-recent `st_table` row for the migrating table.
I've also renamed the one previously-extant replay-only side table,
`table_dropped`, to include the `replay_` prefix, which IMO improves
clarity. And I've made it so `replay_table_dropped` is cleared at the
end of each transaction, as the previous behavior of continuing to
ignore a table that should be unreachable masked errors which would have
been helpful when debugging this issue.
This PR also includes an extended error message when encountering a
unique constraint violation while replaying, which I found helpful while
debugging.
# API and ABI breaking changes
N/a
# Expected complexity level and risk
2 - replay is complicated and scary, but this PR isn't gonna make things
*more* broken than they already were.
# Testing
- [x] Manually replayed a commitlog which included a migration that
altered a table's `table_access`, which was previously broken but now
replays successfully.
# Description of Changes
Fixes https://github.com/clockworklabs/SpacetimeDB/issues/2824.
Defines a global pool `BsatnRowListBuilderPool` which reclaims the
buffers of a `ServerMessage<BsatnFormat>` and which is then used when
building new `ServerMessage<BsatnFormat>`s.
Notes:
1. The new pool `BsatnRowListBuilderPool` reports the same kind of
metrics to prometheus as `PagePool` does.
2. `BsatnRowListBuilder` now works in terms of `BytesMut`.
3. The trait method `fn to_bsatn_extend` is redefined to be capable of
dealing with `BytesMut` as well as `Vec<u8>`.
4. A trait `ConsumeEachBuffer` is defined for `ServerMessage<BsatnFormat>` and the types it contains, to extract buffers.
`<ServerMessage<_> as ConsumeEachBuffer>::consume_each_buffer(...)` is then called in `messages::serialize(...)` just after bsatn-encoding the entire message and before any compression is done. This is where the pool reclaims buffers.
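The reclaim cycle can be pictured as a simple free list (a sketch: `Vec<u8>` stands in for `BytesMut`, and the names are illustrative, not the actual pool API):

```rust
use std::sync::Mutex;

/// Sketch of a buffer pool. The real `BsatnRowListBuilderPool` works
/// with `BytesMut` and reports Prometheus metrics; this stand-in only
/// models the take/reclaim cycle.
struct BufferPool {
    free: Mutex<Vec<Vec<u8>>>,
}

impl BufferPool {
    fn new() -> Self {
        BufferPool { free: Mutex::new(Vec::new()) }
    }

    /// Take a reclaimed buffer if available, else allocate fresh.
    fn take(&self) -> Vec<u8> {
        self.free.lock().unwrap().pop().unwrap_or_default()
    }

    /// Called once the encoded message is done with its buffers:
    /// clear and reclaim for the next message.
    fn put(&self, mut buf: Vec<u8>) {
        buf.clear(); // keeps capacity, drops contents
        self.free.lock().unwrap().push(buf);
    }
}

fn main() {
    let pool = BufferPool::new();
    let mut buf = pool.take();
    buf.extend_from_slice(b"row data");
    let cap = buf.capacity();
    pool.put(buf);
    // The next take reuses the same allocation.
    let reused = pool.take();
    assert!(reused.is_empty());
    assert_eq!(reused.capacity(), cap);
}
```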
# Benchmarks
Benchmark numbers vs. master using `cargo bench --bench subscription --
--baseline subs` on i7-7700K, 64GB RAM:
```
footprint-scan time: [21.607 ms 21.873 ms 22.187 ms]
change: [-62.090% -61.438% -60.787%] (p = 0.00 < 0.05)
Performance has improved.
full-scan time: [22.185 ms 22.245 ms 22.324 ms]
change: [-36.884% -36.497% -36.166%] (p = 0.00 < 0.05)
Performance has improved.
```
The improvements in `footprint-scan` are mostly thanks to
https://github.com/clockworklabs/SpacetimeDB/pull/2918, but 7 ms of the
improvements here are thanks to the pool. The improvements to
`full-scan` should be only thanks to the pool.
# API and ABI breaking changes
None
# Expected complexity level and risk
2?
# Testing
- Tests for `Pool<T>` also apply to `BsatnRowListBuilderPool`.
# Description of Changes
Provides new WASM ABIs:
- `datastore_index_scan_point_bsatn`
- `datastore_delete_by_index_scan_point_bsatn`
These are then used where applicable to speed up `.find(_)` and friends.
Point scans are also used more internally where applicable.
What remains after this is to use them in the C# module bindings and to expose them in TS as well.
The PR takes TPS from roughly 36k to 38k on my machine and also makes a difference in flamegraphs, where the time spent in some index scans is substantially decreased.
# API and ABI breaking changes
None
# Expected complexity level and risk
3? This touches the datastore and how we expose it to modules.
# Testing
Some existing tests now exercise the new ABIs by changing what
`.find(_)` and friends do.
---------
Signed-off-by: Mazdak Farrokhzad <twingoow@gmail.com>
# Description of Changes
Fixes https://github.com/clockworklabs/SpacetimeDB/issues/3617.
# API and ABI breaking changes
None
# Expected complexity level and risk
1
# Testing
A proptest `empty_range_scans_dont_panic` is added.
# Description of Changes
There were mentions of `hashbrown` in the repo that did not go through
`spacetimedb_data_structures::map`.
This caused compile errors on master when running certain tests locally.
These have been replaced with the proper imports.
The PR also bumps `hashbrown` to 0.16.1 and `foldhash` to 0.2.0.
# API and ABI breaking changes
None
# Expected complexity level and risk
2
# Testing
Covered by existing tests.
# Description of Changes
Updates views atomically on commit, but before downgrading to a
read-only transaction for subscription evaluation.
What this patch does:
1. Renames `ViewId` to `ViewFnPtr`
2. Renames `ViewDatabaseId` to `ViewId`
3. Removes the `module_rx` module watcher from the subscription manager
4. Refactors read sets to only track table scans (index key tracking
will be added later)
5. Drops read sets and removes rows from `st_view_sub` when dropping a
view in an auto-migrate
6. Re-evaluates and updates views (`call_views_with_tx`) from
`call_reducer_with_tx` for any view whose read set overlaps with the
reducer's write set
7. Does the same for SQL DML
# API and ABI breaking changes
None
# Expected complexity level and risk
3
It's a bit of a messy diff.
# Testing
- [x] Integrate with
https://github.com/clockworklabs/SpacetimeDB/pull/3616
---------
Signed-off-by: joshua-spacetime <josh@clockworklabs.io>
Co-authored-by: Shubham Mishra <shivam828787@gmail.com>
# Description of Changes
Host implementation to invoke the `call_view` method.
It also covers:
1. An API `MutTxId::is_materialized` to check whether an existing view exists and is updated.
2. An update to the read-set logic to remove stale views.
3. The SQL caller implementation.
# API and ABI breaking changes
NA
# Expected complexity level and risk
3
# Description of Changes
Refactored `st_view_client` and renamed it to `st_view_sub`, which tracks
the number of clients subscribed to a view. On disconnect, we decrement
the `num_subscribers` column in the appropriate rows. An async task will
be in charge of cleaning up views (and their read sets) whose subscriber
count has gone to zero (not in this patch).
On module init, we clear the entirety of each view table.
# API and ABI breaking changes
None. Technically this updates the schema of a system table, but the
system table was added and modified between releases.
# Expected complexity level and risk
~2
Need to make sure we cover all cases so that we don't leave dangling
data. Making these tables ephemeral in the future should simplify this.
# Testing
Will add tests once we can subscribe to views.
# Description of Changes
Not many changes were required for the query compiler to be able to
resolve views. This is because the query engine can always assume a view
is materialized and therefore has a backing table. So from the
perspective of the query engine, a view is just another table with one
small caveat: The physical table in the datastore has two internal
metadata columns - `sender` and `arg_id`. These columns are not user
facing and so should be hidden from name resolution/type checking.
# API and ABI breaking changes
None
# Expected complexity level and risk
1.5
# Testing
- [x] SQL type checking tests
# Description of Changes
This patch defines the system table schemas for views and allocates IDs
for them. It **does not** populate these tables.
# API and ABI breaking changes
None
# Expected complexity level and risk
1
# Testing
Tests will be added in the patch that populates these tables.
`TableIndex` merge logic was missing variants.
# Description of Changes
This pull request fixes an issue where the `TableIndex::can_merge` function was missing match variants.
# API and ABI breaking changes
None.
# Expected complexity level and risk
1 / 5 (Trivial)
This is a very low-risk change; it simply adds a few lines to a match statement.
# Testing
Manual testing confirms that creating two tables with indices for the previously missing variants no longer causes a panic.
# Description of Changes
Necessary for pulling in rolldown.
# API and ABI breaking changes
None
# Expected complexity level and risk
1, with the caveat that this updates the Rust version and therefore
touches all the code.
# Testing
- [ ] Just the automated testing
# Description of Changes
Implements `Serialize` and `Deserialize` for tuples.
Extracted from https://github.com/clockworklabs/SpacetimeDB/pull/3276.
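Such impls are typically generated with a small macro over tuple arities; here is a sketch using a toy trait (not SpacetimeDB's actual `Serialize` definition):

```rust
/// Toy trait standing in for the real `Serialize` (sketch only).
trait Ser {
    fn ser(&self, out: &mut Vec<u8>);
}

impl Ser for u32 {
    fn ser(&self, out: &mut Vec<u8>) {
        out.extend_from_slice(&self.to_le_bytes());
    }
}

impl Ser for u8 {
    fn ser(&self, out: &mut Vec<u8>) {
        out.push(*self);
    }
}

/// Generate tuple impls by serializing each element in order.
macro_rules! impl_tuple_ser {
    ($($T:ident . $i:tt),+) => {
        impl<$($T: Ser),+> Ser for ($($T,)+) {
            fn ser(&self, out: &mut Vec<u8>) {
                $(self.$i.ser(out);)+
            }
        }
    };
}

impl_tuple_ser!(A.0);
impl_tuple_ser!(A.0, B.1);
impl_tuple_ser!(A.0, B.1, C.2);

fn main() {
    let mut out = Vec::new();
    (1u32, 2u8).ser(&mut out);
    // u32 little-endian bytes followed by the u8.
    assert_eq!(out, vec![1, 0, 0, 0, 2]);
}
```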
# API and ABI breaking changes
None
# Expected complexity level and risk
2
# Testing
A test `roundtrip_tuples_in_different_data_formats` is added.
# Description of Changes
- Adds an `add_columns_to_table` API to handle `AutoMigrateStep::AddColumns`; see the doc comment for more info.
Depends on: https://github.com/clockworklabs/SpacetimeDB/pull/3261
TODO: handle `AutoMigrateStep::DisconnectAllUsers`.
# API and ABI breaking changes
N/A.
# Expected complexity level and risk
2? Changes are not in hotpath but a bug in migration can corrupt tables.
# Testing
A test exercising the API is added.
---------
Signed-off-by: Mazdak Farrokhzad <twingoow@gmail.com>
Signed-off-by: Shubham Mishra <shivam828787@gmail.com>
Co-authored-by: Mazdak Farrokhzad <twingoow@gmail.com>
# Description of Changes
Alternative to, and closes, https://github.com/clockworklabs/SpacetimeDB/pull/3210.
This version relies on `pending_schema_changes`.
The first commit adds an efficient `clear_table` to the datastore that can be exposed to the module ABI in a follow-up.
The second commit fixes `drop_table`.
# API and ABI breaking changes
None
# Expected complexity level and risk
3?
# Testing
`test_drop_table_is_transactional` is amended to check `TxData`.
---------
Signed-off-by: Mazdak Farrokhzad <twingoow@gmail.com>
Co-authored-by: Shubham Mishra <shubham@clockworklabs.io>
Co-authored-by: Shubham Mishra <shivam828787@gmail.com>
# Description of Changes
Title. When validating column type changes in `Table::change_columns_to`
and comparing layouts in `RowTypeLayout::is_compatible_with`, return
structured error objects which describe the mismatch.
I made this change while debugging the issue that led to #3203, where
`change_columns_to` returned an error during replay of an automigration
which had succeeded the first time. The original error contained enough
information to debug, but it was presented in a noisy and unhelpful way,
so I wrote this patch to get better diagnostics.
Per @Centril's comments, these errors are for internal assertions, not user-facing error reporting, so we value failing fast over error hygiene and choose not to use the `ErrorStream` combinator.
Also note that I sprinkled `Box`es around some large-ish error types to quiet https://rust-lang.github.io/rust-clippy/master/index.html#result_large_err.
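The `Box`ing pattern looks roughly like this (field names and the validation here are made up for illustration):

```rust
/// Sketch: a structured mismatch error, boxed so `Result<T, _>` stays
/// small and `clippy::result_large_err` is quiet (fields made up).
#[derive(Debug, PartialEq)]
struct LayoutMismatch {
    col: usize,
    expected: &'static str,
    found: &'static str,
}

fn check_layout(
    expected: &[&'static str],
    found: &[&'static str],
) -> Result<(), Box<LayoutMismatch>> {
    for (col, (e, f)) in expected.iter().zip(found).enumerate() {
        if e != f {
            return Err(Box::new(LayoutMismatch { col, expected: *e, found: *f }));
        }
    }
    Ok(())
}

fn main() {
    assert!(check_layout(&["u32", "str"], &["u32", "str"]).is_ok());
    let err = check_layout(&["u32", "str"], &["u32", "u64"]).unwrap_err();
    assert_eq!(err.col, 1);
    // Structured fields make the diagnostic precise instead of a noisy string.
    println!("column {}: expected {}, found {}", err.col, err.expected, err.found);
}
```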
# API and ABI breaking changes
N/a
# Expected complexity level and risk
1.
# Testing
Manual testing with BitCraft. As this is purely altering the error
reporting for non-user-facing assertions, I don't believe any further
testing is necessary.
# Description of Changes
We recently merged several repos together. This PR clarifies the license
terms for several subdirectories, as well as the relationship between
the licenses.
The licenses in our subdirectories have become symbolic links to
licenses in our toplevel `licenses` directory. For any particular
subdirectory's license file in the diff, you can click `... -> View
file` and then click on the text that says "Symbolic Link" on that page.
This will take you to the license file that it links to.
I have also updated the `tools/upgrade-version` script to update the
change date in the new `licenses/BSL.txt` file.
# API and ABI breaking changes
None.
# Expected complexity level and risk
1
# Testing
None. Only changes to license files.
---------
Co-authored-by: Zeke Foppa <bfops@users.noreply.github.com>