Commit Graph

63 Commits

Author SHA1 Message Date
Mazdak Farrokhzad da71d0f9b1 WASM ABI: insert -> datastore_insert_bsatn & impl new semantics (#1639) 2024-09-05 19:32:26 +00:00
james gilles 2be42156b2 Allow converting new ModuleDef to old TableSchema (#1630) 2024-08-29 18:17:29 +00:00
Mazdak Farrokhzad 3be5c83d99 [WASM ABI 1.0] __call_reducer__ receives Identity & Address by value (#1607)
Signed-off-by: Mazdak Farrokhzad <twingoow@gmail.com>
Co-authored-by: Ingvar Stepanyan <me@rreverser.com>
Co-authored-by: Noa <coolreader18@gmail.com>
2024-08-19 22:20:57 +00:00
Mazdak Farrokhzad 1ca9b1a933 [WASM ABI 1.0] Change ColId from u32 to u16 (#1597) 2024-08-19 17:56:28 +00:00
Mazdak Farrokhzad 6a08674ccb Allow empty ColList (#1588) 2024-08-15 16:08:51 +00:00
Mazdak Farrokhzad 1e8e18d74b Add support for I256 and U256 (#1477) 2024-08-08 18:40:35 +00:00
Mazdak Farrokhzad 3340ceea8a SATS: Flatten AlgebraicType, getting rid of BuiltinType (#1559)
Signed-off-by: Mazdak Farrokhzad <twingoow@gmail.com>
Signed-off-by: joshua-spacetime <josh@clockworklabs.io>
Co-authored-by: joshua-spacetime <josh@clockworklabs.io>
2024-08-06 18:15:19 +00:00
Kim Altintop 85082077e2 table: Make with_mut_schema clone-on-write (#1530) 2024-07-29 13:45:21 +00:00
james gilles 45b2ceee9a Move schemas to schema crate, rename Def to RawDefV8 (#1498) 2024-07-24 17:38:30 +00:00
james gilles f81f2a7492 Move db module from spacetimedb_sats to spacetimedb_lib (#1479) 2024-07-17 20:59:44 +00:00
Phoebe Goldman 04a7508120 Table::is_row_present: don't panic (#1526) 2024-07-17 18:41:04 +00:00
Phoebe Goldman 6c45e76a98 Integrate snapshotting into core (#1344) 2024-06-11 12:40:02 +00:00
Phoebe Goldman 8c5f40db8d Add the snapshot crate, which implements snapshotting at a low level (#1340)
* Add the `snapshot` crate, which implements snapshotting at a low level

- Requires making `BlobHash` be `Serialize` and `Deserialize`.
  For arcane macro-ology reasons, this requires writing `BlobHash::SIZE`
  instead of `Self::SIZE` (it gets embedded in a visitor struct or something).
- Requires adding two new operators to `BlobStore`.
- Adds a return value to `Page::save_content_hash`, for convenience.
- Impls `DerefMut` for `Pages`.
- **Scary change:** adds `Table::pages_mut`.
  I think possibly this operator should be `unsafe`,
  since write access to the `Pages` allows an undisciplined caller
  to violate the `Table`'s assumptions by corrupting a `Page`.
  It seems like an anti-pattern to mark a method `unsafe` on the grounds that
  misusing its return value can cause UB,
  but I don't see a plausible alternative
  without making most methods on `Page` unsafe.
  Open to feedback on this one!

* Nix `Table::pages_mut`

* Address Mazdak's feedback

* Use `thiserror` rather than `anyhow` for better error hygiene
2024-06-05 21:58:12 +00:00
Phoebe Goldman a214f78f0b Impl Serialize, Deserialize for Page (#1335)
* Impl `Serialize`, `Deserialize` for `Page`

Snapshotting needs to write `Page`s to files and read them back again.
To that effect, this commit implements `Serialize` and `Deserialize` for `Page`.

* Address Mazdak's review

- Fix soundness in `FixedBitSet` by moving an assert.
- Add commentary to test.
- Add commentary to `spacetimedb-lib` dependency.
2024-06-04 15:49:27 +00:00
Shubham Mishra cf4b9aa282 metric for table size (#1319)
* table size metric

* field blob_store_bytes in table

* address comments

* NumBlobBytes type

* table size metrics: adjust comments, visibility + harden test

---------

Co-authored-by: Mazdak Farrokhzad <twingoow@gmail.com>
2024-05-31 17:44:12 +00:00
Mazdak Farrokhzad 25513b37b9 Simplify btree_index module with more idiomatic Rust (#1285)
* simplify btree_index module, more idiomatic Rust

* test
2024-05-23 13:39:48 +00:00
joshua-spacetime 88a8adad70 feat(1231): Basic query cardinality estimation (#1273)
* feat(1231): Basic query cardinality estimation

This patch implements basic cardinality estimation for QueryExpr.
It utilizes table cardinalities and number of distinct values for index related operators.

* estimation tests: dedup + define constants for readability

* row_est: simplify with slice patterns

* fn ndv -> fn num_distinict_values

* simplify TypedIndex::num_keys

* is_range -> is_point (invert) + fuse arms in row_est

* estimation: fix logic for IndexJoin

---------

Co-authored-by: Mazdak Farrokhzad <twingoow@gmail.com>
2024-05-23 08:49:07 +00:00
Mazdak Farrokhzad ebc921849e privatize Table::row_layout + related BTreeIndex refactoring (#1262) 2024-05-20 18:44:04 +00:00
Mazdak Farrokhzad 91f7e8c917 add PageHeader::unmodified_hash, a BLAKE3 hash for snapshotting (#1249) 2024-05-20 17:47:42 +00:00
Mazdak Farrokhzad e109385c1e remove BTreeIndex::name again (#1251) 2024-05-20 15:52:57 +00:00
Mazdak Farrokhzad d188f966c2 move static bsatn layout to module + harden test (#1254) 2024-05-20 15:52:35 +00:00
Mazdak Farrokhzad 0b89165cec - Table::get_fixed_row -> RowRef::get_row_data (#1250)
- Document some table methods
2024-05-20 07:14:35 +00:00
Shubham Mishra 0c0567ecbf row_count field to table (#1242)
* rowcount

* added tests
2024-05-17 17:48:58 +00:00
Mazdak Farrokhzad 5154f7969e to_bsatn_extend/vec: use uninit instead of zeroed buffer (#1204) 2024-05-14 17:03:16 +00:00
Noa 3b754f10b1 Bump to Rust 1.78 (#1205)
* Bump to rust 1.78

* Fix lints
2024-05-08 14:20:12 +00:00
Phoebe Goldman 484ba824ba Make Page always fully init (#1193)
* Make `Page` always fully init

Per discussion on the snapshotting proposal,
this PR changes the type of `Page.row_data` to `[u8; _]`,
where previously it was `[MaybeUninit<u8>; _]`.

This turns out to be shockingly easy,
as our serialization codepaths never write padding bytes into a page.
The only place pages ever became `poison` was the initial allocation;
changing this to `alloc_zeroed` causes the `row_data` to always be valid at `[u8; _]`.

The majority of this diff is replacing `MaybeUninit`-specific operators
with their initialized equivalents,
and updating comments and documentation to reflect the new requirements.

This change also revealed a bug in the benchmarks
introduced when we swapped the order of sum tags and payloads
( https://github.com/clockworklabs/SpacetimeDB/pull/1063 ),
where benchmarks used a hardcoded offset for the tag which had not been updated.

* Update blake3

Blake3 only supports running under Miri as of 1.15.1, the latest version.
Prior versions hard-depended on SIMD intrinsics which Miri doesn't support.

* Address Mazdak's review.

Still pending his agreeing with me that `poison` is a better name than `uninit`.

* "Poison" -> "uninit"

Against my best wishes, for consistency with the broader Rust community's poor choices.

* Remove unnecessary `unsafe` blocks

* More unnecessary `unsafe`; remove forgotten SAFETY comments
2024-05-02 23:15:48 +00:00
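
Note on #1193: the switch to `alloc_zeroed` described above can be illustrated with a minimal sketch. `ToyPage`, `PAGE_SIZE`, and `new_zeroed` are hypothetical names made up for this example and are not taken from the commit. The point is only that a zeroed allocation makes every byte valid as `u8`, so no `MaybeUninit` is needed.

```rust
use std::alloc::{alloc_zeroed, handle_alloc_error, Layout};

/// Hypothetical page size for this sketch; not taken from the commit.
pub const PAGE_SIZE: usize = 4096;

/// A toy stand-in for a page whose row data is plain `u8`,
/// rather than `MaybeUninit<u8>`.
pub struct ToyPage {
    row_data: [u8; PAGE_SIZE],
}

impl ToyPage {
    /// Allocate a page directly on the heap with every byte set to zero.
    pub fn new_zeroed() -> Box<ToyPage> {
        let layout = Layout::new::<ToyPage>();
        // SAFETY: `layout` has non-zero size, and all-zero bytes are a valid
        // `ToyPage` because its only field is a `[u8; PAGE_SIZE]`.
        let ptr = unsafe { alloc_zeroed(layout) } as *mut ToyPage;
        if ptr.is_null() {
            handle_alloc_error(layout);
        }
        // SAFETY: `ptr` is non-null and was allocated by the global allocator
        // with exactly the layout of `ToyPage`.
        unsafe { Box::from_raw(ptr) }
    }

    /// Read access to the row data; always fully initialized.
    pub fn row_data(&self) -> &[u8] {
        &self.row_data
    }
}
```

Allocating through `alloc_zeroed` instead of building the array on the stack avoids copying a large array into the `Box`.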
Mazdak Farrokhzad e526c8c113 Fix soundness hole in Table::delete + don't make & immediately drop PVs in the method (#1162)
* impl Eq + Hash for RelValue

* Use Hash for RelValue in incr-eval

* naming: spell out pv, rv, and tro

* fix soundness hole in Table::delete + don't make + drop PVs

* Clarify `Table::delete`'s callback `before`

Co-authored-by: Phoebe Goldman <phoebe@clockworklabs.io>
Signed-off-by: Mazdak Farrokhzad <twingoow@gmail.com>

---------

Signed-off-by: Mazdak Farrokhzad <twingoow@gmail.com>
Co-authored-by: Phoebe Goldman <phoebe@clockworklabs.io>
2024-04-30 22:30:50 +00:00
Mazdak Farrokhzad 0142e14de5 Implement RelValue: Eq + Hash (#1107)
* impl Eq + Hash for RelValue

* Use Hash for RelValue in incr-eval

* naming: spell out pv, rv, and tro
2024-04-30 22:13:50 +00:00
Mazdak Farrokhzad b55121cc83 use a custom FixedBitSet + optimize Page::iter_fixed_len (#1160) 2024-04-30 21:57:28 +00:00
Mazdak Farrokhzad 2c07b3bd69 impl PartialEq<ProductValue> for RowRef (#1164)
* impl PartialEq<ProductValue> for RowRef

* Apply Phoebe's suggestions

Co-authored-by: Phoebe Goldman <phoebe@clockworklabs.io>
Signed-off-by: Mazdak Farrokhzad <twingoow@gmail.com>

---------

Signed-off-by: Mazdak Farrokhzad <twingoow@gmail.com>
Co-authored-by: Phoebe Goldman <phoebe@clockworklabs.io>
2024-04-30 21:20:32 +00:00
Mazdak Farrokhzad fd44242e99 1. Add Hash for RowRef + make it consistent with PV. (#1163)
2. Make `RowRef::row_hash` use the above.
3. Make `Table::insert` return a `RowRef`.
4. Use less unsafe because of 1-3.
5. Use `second-stack` to reuse temporary allocations in hashing and serialization.
2024-04-30 17:59:58 +00:00
Mazdak Farrokhzad 516dfe376c impl Eq for RowRef (#1135) 2024-04-25 01:09:26 +00:00
Mazdak Farrokhzad cb0c09bab0 Define Hash + Eq for BSATN (#1112)
* add hash_bsatn + move proptest generators to sats crate

* add eq_bsatn
2024-04-24 23:06:22 +00:00
Mazdak Farrokhzad 9797695ef6 multimap: don't sort values, use push & swap_remove (#1029) 2024-04-22 10:01:17 +00:00
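
Note on #1029: the commit title mentions keeping values unsorted and using `push` and `swap_remove`. A minimal sketch of that idea follows, assuming a multimap backed by a `BTreeMap` of `Vec`s. `ToyMultiMap` and its methods are hypothetical and do not reflect the actual type in the repository.

```rust
use std::collections::BTreeMap;

/// Hypothetical multimap: each key maps to an unsorted list of values.
pub struct ToyMultiMap<K, V> {
    map: BTreeMap<K, Vec<V>>,
}

impl<K: Ord, V: PartialEq> ToyMultiMap<K, V> {
    pub fn new() -> Self {
        Self { map: BTreeMap::new() }
    }

    /// Insert by appending; no sorting of the values for a key.
    pub fn insert(&mut self, key: K, val: V) {
        self.map.entry(key).or_default().push(val);
    }

    /// Delete one occurrence of `val` under `key`.
    /// `swap_remove` is O(1) but does not preserve the order of values.
    pub fn delete(&mut self, key: &K, val: &V) -> bool {
        let Some(vals) = self.map.get_mut(key) else {
            return false;
        };
        let Some(pos) = vals.iter().position(|v| v == val) else {
            return false;
        };
        vals.swap_remove(pos);
        let now_empty = vals.is_empty();
        if now_empty {
            // Drop the key entirely once it has no values left.
            self.map.remove(key);
        }
        true
    }

    /// All values stored under `key`, in unspecified order.
    pub fn values(&self, key: &K) -> &[V] {
        self.map.get(key).map(Vec::as_slice).unwrap_or(&[])
    }
}
```

The trade-off in this sketch is that lookups of a specific value become a linear scan, while inserts and removals avoid shifting or re-sorting elements.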
Mazdak Farrokhzad f560101551 Make Table::clone_structure cheaper by: (#1090)
- Arcing `TableSchema`, and this has benefits elsewhere too.
- Arc<[_]>ing the visitor program instructions.

The data behind the Arcs very rarely change,
which is the perfect case for an Arc.
2024-04-16 19:07:36 +00:00
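
Note on #1090: a minimal sketch of why putting rarely-changing data behind an `Arc` makes a structural clone cheap. `ToySchema`, `ToyTable`, and their fields are hypothetical stand-ins, not the real `TableSchema` or `Table`.

```rust
use std::sync::Arc;

/// Hypothetical schema; stands in for data that rarely changes.
pub struct ToySchema {
    pub name: String,
    pub columns: Vec<String>,
}

/// Hypothetical table holding shared structure plus its own rows.
pub struct ToyTable {
    pub schema: Arc<ToySchema>,
    /// Stand-in for the visitor program instructions.
    pub program: Arc<[u8]>,
    pub rows: Vec<Vec<u8>>,
}

impl ToyTable {
    /// Clone only the structure: the `Arc`s are bumped (a reference-count
    /// increment each), and the new table starts with no rows.
    pub fn clone_structure(&self) -> Self {
        Self {
            schema: Arc::clone(&self.schema),
            program: Arc::clone(&self.program),
            rows: Vec::new(),
        }
    }
}
```

Without the `Arc`s, each structural clone in this sketch would deep-copy the schema strings and the instruction buffer.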
Mazdak Farrokhzad d6815ebf9c Shrink AV and AT to 24 & 16 bytes respectively, and also friends. (#1047) 2024-04-13 16:51:18 +00:00
james gilles 1c2e63e0a4 Table: skip alignment checks in eq_row_in_page and hash_row_in_page (#1085)
* Table: skip alignment checks in eq_row_in_page and hash_row_in_page

* Whoops, those comments can stay the same.
2024-04-12 16:05:34 +00:00
james gilles b9cee3d09f Swap the location of tags in the BFLATN encoding (#1063)
* Swap the location of tags to go before variant data in the BFLATN encoding

* Fix a comment

* Apply suggestions from code review (@gefjon @Centril)

Co-authored-by: Mazdak Farrokhzad <twingoow@gmail.com>
Co-authored-by: Phoebe Goldman <phoebe@clockworklabs.io>
Signed-off-by: james gilles <jameshgilles@gmail.com>

* Implement memcpy consolidation for sums

* Vanquish clippy

---------

Signed-off-by: james gilles <jameshgilles@gmail.com>
Co-authored-by: Mazdak Farrokhzad <twingoow@gmail.com>
Co-authored-by: Phoebe Goldman <phoebe@clockworklabs.io>
2024-04-11 20:10:00 +00:00
Mazdak Farrokhzad 344861f290 use nohasher_hash and ahash instead of siphash13 (#1040)
* use nohasher_hash and ahash instead of siphash13

* re-export types in spacetimedb_data_structures::map
2024-04-05 17:30:51 +00:00
Mazdak Farrokhzad b6c0e1c4d8 Add AlgebraicValue::take + move test-code in btree_index to tests (#1028)
* add AlgebraicValue::take for a neater interface

* btree_index: move test-only code to tests
2024-03-27 17:05:19 +00:00
Mazdak Farrokhzad 9141a42622 Bump Rust to 1.77 + fix warnings + use Bound::map (#1020)
* bump Rust to 1.77 + fix warnings + use Bound::map

* use .truncate(true) for OpenOptions
2024-03-25 20:27:08 +00:00
Phoebe Goldman ba8a8d93c3 BFLATN -> BSATN fast-path for fixed-length rows (#1005)
* Implement (but do not use) a fast path for BFLATN -> BSATN conversion

* fmt and clippy

* `u16` offset rather than `usize`

* Address Joshua's review

* Define methods on `RowRef` and `RelValue` which use the new serializer

* Comment in `align_to` about div-by-zero

Co-authored-by: Mazdak Farrokhzad <twingoow@gmail.com>
Signed-off-by: Phoebe Goldman <phoebe@goldman-tribe.org>

* Add benchmark comparing BFLATN -> BSATN with and without the fast path

* Add benchmark on `u64_u64_u32`, which has less interior padding than `u32_u64_u64`

* Remove `to_len` from `to_bsatn_extend`

It turns out to be slower than just eating the `realloc`s.

* Remove unused `to_bsatn_slice`

I thought I would need it, but it ended up not being useful.

* Expand comment with example; `Box<[...]>` to reduce memory footprint

* Comments from Mazdak's review

---------

Signed-off-by: Phoebe Goldman <phoebe@goldman-tribe.org>
Co-authored-by: Mazdak Farrokhzad <twingoow@gmail.com>
2024-03-25 19:46:10 +00:00
joshua-spacetime 47e787877f test(1099): Multi-column index selection through query macro (#1001) 2024-03-21 23:33:13 +00:00
Phoebe Goldman 02aeac7fdc Don't do alignment computations during BFLATN ser/de (#986)
`AlgebraicTypeLayout` and friends already include full layout information,
including properly-aligned offsets for `ProductTypeElementLayout`s.
As such, there's no need to do any alignment computation
during `serialize_value` or `write_value`.

Instead, while traversing a `ProductTypeLayout`,
we can use each element's `offset` to update the `curr_offset`.
2024-03-18 14:25:59 +00:00
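
Note on #986: a minimal sketch of the idea in the message above, where aligned offsets are computed once when the layout is built and the write path only reads them. `ElemLayout`, `build_layout`, and `write_row` are hypothetical names for this example and are far simpler than `AlgebraicTypeLayout` and `ProductTypeElementLayout`.

```rust
/// Hypothetical, simplified element layout: a precomputed offset and size in bytes.
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct ElemLayout {
    pub offset: usize,
    pub size: usize,
}

/// Compute aligned offsets once, when the layout is built.
/// Each input pair is `(size, align)`; `align` is assumed to be a power of two.
pub fn build_layout(sizes_and_aligns: &[(usize, usize)]) -> Vec<ElemLayout> {
    let mut curr = 0;
    sizes_and_aligns
        .iter()
        .map(|&(size, align)| {
            // Round `curr` up to the next multiple of `align`.
            curr = (curr + align - 1) & !(align - 1);
            let elem = ElemLayout { offset: curr, size };
            curr += size;
            elem
        })
        .collect()
}

/// Write each field at its precomputed offset; no alignment math happens here.
pub fn write_row(layout: &[ElemLayout], fields: &[&[u8]], out: &mut [u8]) {
    for (elem, bytes) in layout.iter().zip(fields) {
        assert_eq!(bytes.len(), elem.size);
        out[elem.offset..elem.offset + elem.size].copy_from_slice(bytes);
    }
}
```

For a `(u32, u64, u64)` row this sketch places the fields at offsets 0, 8, and 16, leaving bytes 4 to 7 as padding that the writer never touches.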
Mario Montoya 891f6b8931 Truly remove perfcnt (#946) 2024-03-08 20:26:30 +00:00
james gilles 1611d10713 Remove perfcnt for now (#941) 2024-03-07 21:16:58 +00:00
Mazdak Farrokhzad 913801e22a - Make RelValue into a cow-like structure. (#869)
- Move it and friends from sats to vm.
- MemTable now stores a Vec<PV>.
- Other related improvements.

Co-authored-by: Phoebe Goldman <phoebe@goldman-tribe.org>
2024-02-21 20:07:39 +00:00
Mazdak Farrokhzad 04120e778a system_tables, mut_tx, and friends: bye bye to_product_value (#851)
* optimize build_missing_tables; collect less + use read_col

* schema_for_table: don't go through PV

* table_id_from_name: use read_col

* relational_db tests: remove uses of to_product_value

* build_sequence_state: don't use to_pv

* build_indexes: don't use to_pv

* remove dead code: CommittedStateIter

* get_all_tables_tx: use read_col

* SystemTableQuery: don't use to_pv

* StModuleRow: don't use to_pv

* get rid of more to_product_value calls

* read_col / system_tables / mut_tx: cleanup + less work2
2024-02-19 19:21:13 +00:00
Mazdak Farrokhzad 5ab4342187 Refactor some ReadColumn stuff + relational_db tests (#847)
* refactor with read_col method + simplify InvalidFieldError creation

* simplify + dedup relational_db tests

* relational_db: dedup tests & nix some to_product_value call

* dedup relational_db tests more
2024-02-19 01:57:36 +00:00
Noa e6cef1b627 Fix bench errors and include in CI (#855) 2024-02-16 19:15:13 +00:00