SpacetimeDB

mirror of https://github.com/clockworklabs/SpacetimeDB.git synced 2026-05-23 08:01:08 -04:00

Author	SHA1	Message	Date
Phoebe Goldman	6c45e76a98	Integrate snapshotting into core (#1344 )	2024-06-11 12:40:02 +00:00
Phoebe Goldman	8c5f40db8d	Add the `snapshot` crate, which implements snapshotting at a low level (#1340 ) * Add the `snapshot` crate, which implements snapshotting at a low level - Requires making `BlobHash` be `Serialize` and `Deserialize`. For arcane macro-ology reasons, this requires writing `BlobHash::SIZE` instead of `Self::SIZE` (it gets embedded in a visitor struct or something). - Requires adding two new operators to `BlobStore`. - Adds a return value to `Page::save_content_hash`, for convenience. - Impls `DerefMut` for `Pages`. - Scary change: adds `Table::pages_mut`. I think possibly this operator should be `unsafe`, since write access to the `Pages` allows an undisciplined caller to violate the `Table`'s assumptions by corrupting a `Page`. It seems like an anti-pattern to mark a method `unsafe` on the grounds that misusing its return value can cause UB, but I don't see a plausible alternative without making most methods on `Page` unsafe. Open to feedback on this one! * Nix `Table::pages_mut` * Address Mazdak's feedback * Use `thiserror` rather than `anyhow` for better error hygiene	2024-06-05 21:58:12 +00:00
Phoebe Goldman	a214f78f0b	Impl `Serialize`, `Deserialize` for `Page` (#1335 ) * Impl `Serialize`, `Deserialize` for `Page` Snapshotting needs to write `Page`s to files and read them back again. To that effect, this commit implements `Serialize` and `Deserialize` for `Page`. * Address Mazdak's review - Fix soundness in `FixedBitSet` by moving an assert. - Add commentary to test. - Add commentary to `spacetimedb-lib` dependency.	2024-06-04 15:49:27 +00:00
Shubham Mishra	cf4b9aa282	metric for table size (#1319 ) * table size metric * feld blob_store_bytes in table * address comments * NumBlobBytes type * table size metrics: adjust comments, visibility + harden test --------- Co-authored-by: Mazdak Farrokhzad <twingoow@gmail.com>	2024-05-31 17:44:12 +00:00
Mazdak Farrokhzad	25513b37b9	Simplify `btree_index` module with more idiomatic Rust (#1285 ) * simplify btree_index module, more idiomatic Rust * test	2024-05-23 13:39:48 +00:00
joshua-spacetime	88a8adad70	feat(1231): Basic query cardinality estimation (#1273 ) * feat(1231): Basic query cardinality estimation This patch implements basic cardinality estimation for QueryExpr. It utilizes table cardinalities and number of distinct values for index related operators. * estimation tests: dedup + define constants for readability * row_est: simplify with slice patterns * fn ndv -> fn num_distinict_values * simplify TypedIndex::num_keys * is_range -> is_point (invert) + fuse arms in row_est * estimation: fix logic for IndexJoin --------- Co-authored-by: Mazdak Farrokhzad <twingoow@gmail.com>	2024-05-23 08:49:07 +00:00
Mazdak Farrokhzad	ebc921849e	privatize Table::row_layout + related BTreeIndex refactoring (#1262 )	2024-05-20 18:44:04 +00:00
Mazdak Farrokhzad	91f7e8c917	add PageHeader::unmodified_hash, a BLAKE3 hash for snapshotting (#1249 )	2024-05-20 17:47:42 +00:00
Mazdak Farrokhzad	e109385c1e	remove BTreeIndex::name again (#1251 )	2024-05-20 15:52:57 +00:00
Mazdak Farrokhzad	d188f966c2	move static bsatn layout to module + harden test (#1254 )	2024-05-20 15:52:35 +00:00
Mazdak Farrokhzad	0b89165cec	- Table::get_fixed_row -> RowRef::get_row_data (#1250 ) - Document some table methods	2024-05-20 07:14:35 +00:00
Shubham Mishra	0c0567ecbf	row_count field to table (#1242 ) * rowcount * added tests	2024-05-17 17:48:58 +00:00
Mazdak Farrokhzad	5154f7969e	to_bsatn_extend/vec: use uninit instead of zeroed buffer (#1204 )	2024-05-14 17:03:16 +00:00
Noa	3b754f10b1	Bump to Rust 1.78 (#1205 ) * Bump to rust 1.78 * Fix lints	2024-05-08 14:20:12 +00:00
Phoebe Goldman	484ba824ba	Make `Page` always fully init (#1193 ) * Make `Page` always fully init Per discussion on the snapshotting proposal, this PR changes the type of `Page.row_data` to `[u8; _]`, where previously it was `[MaybeUninit<u8>; _]`. This turns out to be shockingly easy, as our serialization codepaths never write padding bytes into a page. The only place pages ever became `poison` was the initial allocation; changing this to `alloc_zeroed` causes the `row_data` to always be valid at `[u8; _]`. The majority of this diff is replacing `MaybeUninit`-specific operators with their initialized equivalents, and updating comments and documentation to reflect the new requirements. This change also revealed a bug in the benchmarks introduced when we swapped the order of sum tags and payloads ( https://github.com/clockworklabs/SpacetimeDB/pull/1063 ), where benchmarks used a hardcoded offset for the tag which had not been updated. * Update blake3 Blake3 only supports running under Miri as of 1.15.1, the latest version. Prior versions hard-depended on SIMD intrinsics which Miri doesn't support. * Address Mazdak's review. Still pending his agreeing with me that `poison` is a better name than `uninit`. * "Poison" -> "uninit" Against my best wishes, for consistency with the broader Rust community's poor choices. * Remove unnecessary `unsafe` blocks * More unnecessary `unsafe`; remove forgotten SAFETY comments	2024-05-02 23:15:48 +00:00
Mazdak Farrokhzad	e526c8c113	Fix soundness hole in `Table::delete` + don't make & immedately drop PVs in the method (#1162 ) * impl Eq + Hash for RelValue * Use Hash for RelValue in incr-eval * naming: spell out pv, rv, and tro * fix soundness hole in Table::delete + don't make + drop PVs * Clarify `Table::delete`'s callback `before` Co-authored-by: Phoebe Goldman <phoebe@clockworklabs.io> Signed-off-by: Mazdak Farrokhzad <twingoow@gmail.com> --------- Signed-off-by: Mazdak Farrokhzad <twingoow@gmail.com> Co-authored-by: Phoebe Goldman <phoebe@clockworklabs.io>	2024-04-30 22:30:50 +00:00
Mazdak Farrokhzad	0142e14de5	Implement `RelValue: Eq + Hash` (#1107 ) * impl Eq + Hash for RelValue * Use Hash for RelValue in incr-eval * naming: spell out pv, rv, and tro	2024-04-30 22:13:50 +00:00
Mazdak Farrokhzad	b55121cc83	use a custom FixedBitSet + optimize Page::iter_fixed_len (#1160 )	2024-04-30 21:57:28 +00:00
Mazdak Farrokhzad	2c07b3bd69	`impl PartialEq<ProductValue> for RowRef` (#1164 ) * impl PartialEq<ProductValue> for RowRef * Apply Phoebe's suggestions Co-authored-by: Phoebe Goldman <phoebe@clockworklabs.io> Signed-off-by: Mazdak Farrokhzad <twingoow@gmail.com> --------- Signed-off-by: Mazdak Farrokhzad <twingoow@gmail.com> Co-authored-by: Phoebe Goldman <phoebe@clockworklabs.io>	2024-04-30 21:20:32 +00:00
Mazdak Farrokhzad	fd44242e99	1. Add `Hash for RowRef` + make it consistent with PV. (#1163 ) 2. Make `RowRef::row_hash` use the above. 3. Make `Table::insert` return a `RowRef`. 4. Use less unsafe because of 1-3. 5. Use `second-stack` to reuse temporary allocations in hashing and serialization.	2024-04-30 17:59:58 +00:00
Mazdak Farrokhzad	516dfe376c	impl Eq for RowRef (#1135 )	2024-04-25 01:09:26 +00:00
Mazdak Farrokhzad	cb0c09bab0	Define `Hash + Eq` for BSATN (#1112 ) * add hash_bsatn + move proptest generators to sats crate * add eq_bsatn	2024-04-24 23:06:22 +00:00
Mazdak Farrokhzad	9797695ef6	multimap: don't sort values, use push & swap_remove (#1029 )	2024-04-22 10:01:17 +00:00
Mazdak Farrokhzad	f560101551	Make Table::clone_structure cheaper by: (#1090 ) - Arcing `TableSchema`, and this has benefits elsewhere too. - Arc<[_]>ing the visitor program instructions. The data behind the Arcs very rarely change, which is the perfect case for an Arc.	2024-04-16 19:07:36 +00:00
Mazdak Farrokhzad	d6815ebf9c	Shrink AV and AT to 24 & 16 bytes respectively, and also friends. (#1047 )	2024-04-13 16:51:18 +00:00
james gilles	1c2e63e0a4	Table: skip alignment checks in eq_row_in_page and hash_row_in_page (#1085 ) * Table: skip alignment checks in eq_row_in_page and hash_row_in_page * Whoops, those comments can stay the same.	2024-04-12 16:05:34 +00:00
james gilles	b9cee3d09f	Swap the location of tags in the BFLATN encoding (#1063 ) * Swap the location of tags to go before variant data in the BFLATN encoding * Fix a comment * Apply suggestions from code review (@gefjon @Centril) Co-authored-by: Mazdak Farrokhzad <twingoow@gmail.com> Co-authored-by: Phoebe Goldman <phoebe@clockworklabs.io> Signed-off-by: james gilles <jameshgilles@gmail.com> * Implement memcpy consolidation for sums * Vanquish clippy --------- Signed-off-by: james gilles <jameshgilles@gmail.com> Co-authored-by: Mazdak Farrokhzad <twingoow@gmail.com> Co-authored-by: Phoebe Goldman <phoebe@clockworklabs.io>	2024-04-11 20:10:00 +00:00
Mazdak Farrokhzad	344861f290	use nohasher_hash and ahash instead of siphash13 (#1040 ) * use nohasher_hash and ahash instead of siphash13 * re-export types in spacetimedb_data_structures::map	2024-04-05 17:30:51 +00:00
Mazdak Farrokhzad	b6c0e1c4d8	Add `AlgebraicValue::take` + move test-code in `btree_index` to tests (#1028 ) * add AlgebraicValue::take for a neater interface * btree_index: move test-only code to tests	2024-03-27 17:05:19 +00:00
Mazdak Farrokhzad	9141a42622	Bump Rust to 1.77 + fix warnings + use `Bound::map` (#1020 ) * bump Rust to 1.77 + fix warnings + use Bound::map * use .truncate(true) for OpenOptions	2024-03-25 20:27:08 +00:00
Phoebe Goldman	ba8a8d93c3	BFLATN -> BSATN fast-path for fixed-length rows (#1005 ) * Implement (but do not use) a fast path for BFLATN -> BSATN conversion * fmt and clippy * `u16` offset rather than `usize` * Address Joshua's review * Define methods on `RowRef` and `RelValue` which use the new serializer * Comment in `align_to` about div-by-zero Co-authored-by: Mazdak Farrokhzad <twingoow@gmail.com> Signed-off-by: Phoebe Goldman <phoebe@goldman-tribe.org> * Add benchmark comparing BFLATN -> BSATN with and without the fast path * Add benchmark on `u64_u64_u32`, which has less interior padding than `u32_u64_u64` * Remove `to_len` from `to_bsatn_extend` It turns out to be slower than just eating the `realloc`s. * Remove unused `to_bsatn_slice` I thought I would need it, but it ended up not being useful. * Expand comment with example; `Box<[...]>` to reduce memory footprint * Comments from Mazdak's review --------- Signed-off-by: Phoebe Goldman <phoebe@goldman-tribe.org> Co-authored-by: Mazdak Farrokhzad <twingoow@gmail.com>	2024-03-25 19:46:10 +00:00
joshua-spacetime	47e787877f	test(1099): Multi-column index selection through query macro (#1001 )	2024-03-21 23:33:13 +00:00
Phoebe Goldman	02aeac7fdc	Don't do alignment computations during BFLATN ser/de (#986 ) `AlgebraicTypeLayout` and friends already include full layout information, including properly-aligned offsets for `ProductTypeElementLayout`s. As such, there's no need to do any alignment computation during `serialize_value` or `write_value`. Instead, while traversing a `ProductTypeLayout`, we can use each element's `offset` to update the `curr_offset`.	2024-03-18 14:25:59 +00:00
Mario Montoya	891f6b8931	Truly remove perfcnt (#946 )	2024-03-08 20:26:30 +00:00
james gilles	1611d10713	Remove perfcnt for now (#941 )	2024-03-07 21:16:58 +00:00
Mazdak Farrokhzad	913801e22a	- Make `RelValue` into a cow-like structure. (#869 ) - Move it and friends from sats to vm. - MemTable now stores a Vec<PV>. - Other related improvements. Co-authored-by: Phoebe Goldman <phoebe@goldman-tribe.org>	2024-02-21 20:07:39 +00:00
Mazdak Farrokhzad	04120e778a	system_tables, mut_tx, and friends: bye bye `to_product_value` (#851 ) * optimize build_missing_tables; collect less + use read_col * schema_for_table: don't go through PV * table_id_from_name: use read_col * relational_db tests: remove uses of to_product_value * build_sequence_state: don't use to_pv * build_indexes: don't use to_pv * remove dead code: CommittedStateIter * get_all_tables_tx: use read_col * SystemTableQuery: don't use to_pv * StModuleRow: don't use to_pv * get rd of more to_product_value calls * read_col / system_tables / mut_tx: cleanup + less work2	2024-02-19 19:21:13 +00:00
Mazdak Farrokhzad	5ab4342187	Refactor some ReadColumn stuff + relational_db tests (#847 ) * refactor with read_col method + simplify InvalidFieldError creation * simplify + dedup relational_db tests * relational_db: dedup tests & nix some to_product_value call * dedup relational_db tests more	2024-02-19 01:57:36 +00:00
Noa	e6cef1b627	Fix bench errors and include in CI (#855 )	2024-02-16 19:15:13 +00:00
John Detter	79fd5ad75e	Fix publish issues 0.8.2 (#853 ) * Fix some issues related to publishing 0.8.2 * Added phoebe's fix * Reverting file --------- Co-authored-by: John Detter <no-reply@boppygames.gg>	2024-02-15 23:29:37 +00:00
Tyler Cloutier	faf8766b72	This fixes replaying of the transaction log to no longer check constraints (#806 ) * This fixes replaying of the transaction log to no longer check constraints * Fixed based on Phoebe and Kim's comments * Cargo fmt * Cargo fmt * This fixes index updating for replaying deletes --------- Signed-off-by: Tyler Cloutier <cloutiertyler@users.noreply.github.com>	2024-02-12 21:42:29 +00:00
Noa	a7a707c0f2	Fix build errors in spacetimedb_table benches (#787 )	2024-02-07 21:20:10 +00:00
Noa	37658aae7e	Add rust-version to Cargo.toml (#802 ) * Add rust-version to Cargo.toml * Use workspace inheritance to make bumping the spacetime version easier	2024-02-07 20:02:05 +00:00
Phoebe Goldman	c6ae09676e	BTreeIndex: use `ReadColumn` instead of `ProductValue` fields (#796 ) * BTreeIndex: Specialize on primitive key types for great performance Rewrite of Tyler's #771, because I thought this would be easier than rebasing. This commit hoists branching on key types outside of comparisons and searches in `BTreeIndex`, so that in many cases we can use e.g. `u64::cmp` instead of the much slower `AlgebraicValue::cmp`. This design is kind of ugly, and will likely change in the future, but for now it's good enough, and is a meaningful performance improvement. * BTreeIndex: use `ReadColumn` instead of `ProductValue` fields Prior to this commit, `BTreeIndex::insert` and `BTreeIndex::delete` took a `&ProductValue`, the row to be inserted or deleted, as an argument, and extracted the indexed column value(s) from it. With this commit, we instead take a `RowRef` referring to the row to be inserted or deleted, and access the indexed column value(s) using `ReadColumn`. For specialized indexes, we extract them directly as native types, rather than as `AlgebraicValue`. Making this change involved a nasty split borrow problem, as we now need to have a `RowRef` to the inserted/deleted row coexisting with a mutable borrow of the index to be modified. This required introducing a `TableInner` struct, which is the referent of `RowRef`. * Lints: superfluous borrows	2024-02-07 15:55:45 +00:00
Phoebe Goldman	b8cecb3caf	BTreeIndex: Specialize on primitive key types for great performance (#793 ) Rewrite of Tyler's #771, because I thought this would be easier than rebasing. This commit hoists branching on key types outside of comparisons and searches in `BTreeIndex`, so that in many cases we can use e.g. `u64::cmp` instead of the much slower `AlgebraicValue::cmp`. This design is kind of ugly, and will likely change in the future, but for now it's good enough, and is a meaningful performance improvement.	2024-02-06 17:57:45 +00:00
Phoebe Goldman	cf6aa6dde0	`trait ReadColumn`, to read a single column from a `RowRef` (#789 ) This commit defines `trait ReadColumn`, which is implemented for types that can be stored in a table column and can be read out of said table column. Its interesting method is `read_column(row: RowRef<'_>, idx: usize) -> Result<Self, TypeError>`, which attempts to read the `idx`th column of `row` as a value of type `Self`, returning a `TypeError` if the row in question does not have the appropriate types. `ReadColumn` is implemented for Rust equivalents of all non-compound `AlgebraicType`s, i.e. integers, floats, `bool` and `String`. It is also implemented for all SATS value types, including those which represent values of compound types, i.e. `AlgebraicValue`, `ProductValue`, `SumValue`, `ArrayValue` and `MapValue`.	2024-02-05 20:45:51 +00:00
Mazdak Farrokhzad	b336fba700	Integrate mem arch into `locking_tx_datastore` (#769 ) * datastore: impl mem arch using spacetimedb_table * instance_env/iters: write bflatn -> bsatn directly * Make tests build, fix 1 failure (of 4) Commit message from SpacetimeDBPrivate 0ae26519d181cb4b13f748af53d036c322d4f642: Bug fix in `delete_by_rel`: don't trigger unique constraints in dummy insert Prior to this commit, `MutTxId::delete_by_rel` on a table with a unique constraint would fail, because the `delete_by_rel` would attempt an insert of the row to be deleted, with the intention of rolling back the insertion later. This would trigger the unique constraint, aborting the `delete_by_rel`, and the actual row would not be deleted. With this commit, `MutTxId::delete_by_rel` properly uses `Table::insert_internal_allow_duplicate` and `Table::delete_internal_skip_pointer_map` to bypass unique constraints. This requires exporting some methods from `Table` which were previously internal. * Remove forgotten `dbg` * `bflatn_to`: panic rather than UB on an ill-typed `ProductValue` The docs in `bflatn_to.rs` claim that we'll panic if the `ProductValue` passed to `write_row_to_pages` does not conform to the type. Prior to this commit, we would panic if some element of the `val` had a different type constructor than the corresponding element of the `ty`, but for `ProductValue`, we would not panic if the value had more or fewer elements than the type. When the value had more elements, this was merely a logic bug; we'd drop the tail elements and go about our lives. When the value had fewer elements, this invoked UB; we'd leave the remainder of the row in the page uninit, and later access uninitialized memory e.g. when hashing the row. This commit adds a check that in `write_product` that the `val` and `ty` have the same length. If they do not, this commit panics. This will need to be revised to return an `Err`, as spacetimedb-core has tests that attempting to insert an ill-typed value returns an `Err` rather than panicking. * Error handling, runtime schema mutations This commit makes two sorts of changes: - It begins filling in the error-handling story of the new mem-arch by defining a few new error enums in the `spacetimedb-table` crate, and converting them into `TableError` in `spacetimedb-core`. - It implements `drop_table`, `rename_table` and friends to mutate a schema at runtime. These operations are broken in master in that they violate transactionality, and they're broken here too. * Re-enable metrics * Resolve some TODOs, incl. error type in `page.rs` * Fixes for Tyler's comments - Restrict some `pub` to `pub(crate)` - Rename `MutTxId::delete_by_rel` to `delete_by_row_value`, since it deletes only one row. - `relations` -> `relation`. - Rename `ignore_duplicate_insert` -> `ignore_duplicate_insert_error` and add docs. - nix `with_table_` methods in favor of imperative `get_table_`. * Note correctness of `state_view::Iter` and add test in `relational_db`. Tyler had concerns about the behavior of `state_view::Iter` w.r.t. the edge case of a committed-deleted-inserted row, since the new datastore maintains the invariant that the committed and inserted tables are disjoint, where the previous datastore did not. This commit adds comments describing that invariant to the iterator, and adds a test in `relational_db` that the external semantics are preserved. * Add notes for future MVCC implementors re: disjoint committed and insert tables Tyler convinced me via video call that, in MVCC, it is no longer valid to treat the committed and insert tables as disjoint, i.e. to elide inserts of already-present rows. Doing so is still valid, and is more performant, in the current locking model, so we retain that behavior here. This commit adds comments to relevant code paths so that, when we move to MVCC, we do not forget to remove the optimization. --------- Co-authored-by: Phoebe Goldman <phoebe@goldman-tribe.org>	2024-02-01 19:43:17 +00:00
Mazdak Farrokhzad	097937bb0e	pointers_yielded -> num_pointers_yielded (#773 )	2024-01-28 09:20:04 +00:00
Mazdak Farrokhzad	145e588c36	Split old datastore more (#770 ) * 1. sort some `Cargo.toml`s 2. expose 'pointers_yielded' 3. move SequenceError * datastore: TxId -> tx.rs * datastore: extract datastore.rs * fix test imports	2024-01-28 04:14:49 +00:00
Tyler Cloutier	ecd366e266	Fixed issue with perfcnt not working on certain platforms and architectures (#768 ) * Fixed issue with perfcnt not working on certain platforms and architectures * Update var_len_visitor.rs	2024-01-27 10:06:52 +00:00

1 2

52 Commits