* Add the `snapshot` crate, which implements snapshotting at a low level
- Requires making `BlobHash` be `Serialize` and `Deserialize`.
For arcane macro-ology reasons, this requires writing `BlobHash::SIZE`
instead of `Self::SIZE` (it gets embedded in a visitor struct or something).
- Requires adding two new operators to `BlobStore`.
- Adds a return value to `Page::save_content_hash`, for convenience.
- Impls `DerefMut` for `Pages`.
- **Scary change:** adds `Table::pages_mut`.
I think possibly this operator should be `unsafe`,
since write access to the `Pages` allows an undisciplined caller
to violate the `Table`'s assumptions by corrupting a `Page`.
It seems like an anti-pattern to mark a method `unsafe` on the grounds that
misusing its return value can cause UB,
but I don't see a plausible alternative
without making most methods on `Page` unsafe.
Open to feedback on this one!
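
For reviewers weighing the `unsafe` question, the trade-off might be sketched roughly as below. Every type here is a simplified, hypothetical stand-in rather than the real `Table`, `Pages`, or `Page`, and the invariant checked (`row_data.len() == num_rows * row_size`) is invented purely for illustration.

```rust
use std::ops::{Deref, DerefMut};

/// Hypothetical, simplified page: NOT the real `Page` type.
pub struct Page {
    pub num_rows: usize,
    pub row_data: Vec<u8>,
}

/// Hypothetical stand-in for `Pages`, with `Deref`/`DerefMut` to the inner `Vec`.
pub struct Pages(Vec<Page>);

impl Deref for Pages {
    type Target = Vec<Page>;
    fn deref(&self) -> &Vec<Page> {
        &self.0
    }
}

impl DerefMut for Pages {
    fn deref_mut(&mut self) -> &mut Vec<Page> {
        &mut self.0
    }
}

/// Hypothetical table whose (invented) invariant is that every page holds
/// exactly `num_rows * row_size` bytes of row data.
pub struct Table {
    pages: Pages,
    row_size: usize,
}

impl Table {
    pub fn new(row_size: usize) -> Self {
        Table { pages: Pages(Vec::new()), row_size }
    }

    /// # Safety
    ///
    /// The caller must leave every `Page` consistent with the table's
    /// invariants, since later safe methods may rely on them.
    pub unsafe fn pages_mut(&mut self) -> &mut Pages {
        &mut self.pages
    }

    /// Checks the illustrative invariant described on `Table`.
    pub fn invariants_hold(&self) -> bool {
        self.pages
            .iter()
            .all(|p| p.row_data.len() == p.num_rows * self.row_size)
    }
}
```

The `# Safety` section moves the burden onto the caller, which is exactly the anti-pattern concern raised above: the method itself does nothing undefined, but misuse of its return value can break code that trusts the invariant.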
* Nix `Table::pages_mut`
* Address Mazdak's feedback
* Use `thiserror` rather than `anyhow` for better error hygiene
* Impl `Serialize`, `Deserialize` for `Page`
Snapshotting needs to write `Page`s to files and read them back again.
To that effect, this commit implements `Serialize` and `Deserialize` for `Page`.
* Address Mazdak's review
- Fix soundness in `FixedBitSet` by moving an assert.
- Add commentary to test.
- Add commentary to `spacetimedb-lib` dependency.
* Make `Page` always fully init
Per discussion on the snapshotting proposal,
this PR changes the type of `Page.row_data` to `[u8; _]`,
where previously it was `[MaybeUninit<u8>; _]`.
This turns out to be shockingly easy,
as our serialization codepaths never write padding bytes into a page.
The only place pages ever became `poison` was the initial allocation;
changing this to `alloc_zeroed` causes the `row_data` to always be valid at `[u8; _]`.
The majority of this diff is replacing `MaybeUninit`-specific operators
with their initialized equivalents,
and updating comments and documentation to reflect the new requirements.
This change also revealed a bug in the benchmarks
introduced when we swapped the order of sum tags and payloads
( https://github.com/clockworklabs/SpacetimeDB/pull/1063 ),
where benchmarks used a hardcoded offset for the tag which had not been updated.
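
A minimal sketch of the zeroed-allocation idea, assuming a hypothetical `Page` with a single `row_data` field and an invented `ROW_DATA_SIZE`; the real `Page` has more fields and its own size constants.

```rust
use std::alloc::{alloc_zeroed, handle_alloc_error, Layout};

/// Hypothetical size, chosen only for this sketch.
const ROW_DATA_SIZE: usize = 64 * 1024;

/// Hypothetical, simplified page: only the row data, stored as plain `u8`.
#[repr(C)]
pub struct Page {
    pub row_data: [u8; ROW_DATA_SIZE],
}

impl Page {
    /// Allocates a page directly on the heap with every byte zeroed,
    /// so `row_data` is valid at `[u8; _]` from the moment it exists.
    pub fn new_zeroed() -> Box<Page> {
        let layout = Layout::new::<Page>();
        // SAFETY: `Page` is not zero-sized, so `layout` has non-zero size.
        let ptr = unsafe { alloc_zeroed(layout) }.cast::<Page>();
        if ptr.is_null() {
            handle_alloc_error(layout);
        }
        // SAFETY: `ptr` came from the global allocator with `Page`'s layout,
        // and an all-zero bit pattern is a valid `[u8; ROW_DATA_SIZE]`.
        unsafe { Box::from_raw(ptr) }
    }
}
```

Because no byte is ever uninitialized, reading `page.row_data` needs no `unsafe` and no `MaybeUninit` accessors.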
* Update blake3
Blake3 supports running under Miri only as of 1.5.1, the latest version.
Prior versions hard-depended on SIMD intrinsics which Miri doesn't support.
* Address Mazdak's review.
Still pending his agreeing with me that `poison` is a better name than `uninit`.
* "Poison" -> "uninit"
Against my best wishes, for consistency with the broader Rust community's poor choices.
* Remove unnecessary `unsafe` blocks
* More unnecessary `unsafe`; remove forgotten SAFETY comments
2. Make `RowRef::row_hash` use the above.
3. Make `Table::insert` return a `RowRef`.
4. Use less unsafe because of 1-3.
5. Use `second-stack` to reuse temporary allocations in hashing and serialization.
- `Arc`ing `TableSchema`, which has benefits elsewhere too.
- `Arc<[_]>`ing the visitor program instructions.
The data behind the `Arc`s very rarely changes,
which is the perfect case for an `Arc`.
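
As a rough illustration of why `Arc<[_]>` suits rarely-changing data, the sketch below shares one instruction slice between clones. `Instr` and `VisitorProgram` are hypothetical names invented here, not the actual types in the codebase.

```rust
use std::sync::Arc;

/// Hypothetical instruction type, for illustration only.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Instr {
    CopyBytes { offset: u16, len: u16 },
    VisitSum { tag_offset: u16 },
}

/// Hypothetical program wrapper: cloning it bumps a reference count
/// instead of copying the instructions.
#[derive(Clone)]
pub struct VisitorProgram {
    pub instrs: Arc<[Instr]>,
}

impl VisitorProgram {
    pub fn new(instrs: Vec<Instr>) -> Self {
        // `Arc<[T]>: From<Vec<T>>` moves the elements into one shared allocation.
        VisitorProgram { instrs: instrs.into() }
    }
}
```

Compared with `Arc<Vec<_>>`, an `Arc<[_]>` avoids one level of pointer indirection, at the cost of not being growable in place.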
* Swap the location of tags to go before variant data in the BFLATN encoding
* Fix a comment
* Apply suggestions from code review (@gefjon @Centril)
Co-authored-by: Mazdak Farrokhzad <twingoow@gmail.com>
Co-authored-by: Phoebe Goldman <phoebe@clockworklabs.io>
Signed-off-by: james gilles <jameshgilles@gmail.com>
* Implement memcpy consolidation for sums
* Vanquish clippy
---------
Signed-off-by: james gilles <jameshgilles@gmail.com>
Co-authored-by: Mazdak Farrokhzad <twingoow@gmail.com>
Co-authored-by: Phoebe Goldman <phoebe@clockworklabs.io>
* Implement (but do not use) a fast path for BFLATN -> BSATN conversion
* fmt and clippy
* `u16` offset rather than `usize`
* Address Joshua's review
* Define methods on `RowRef` and `RelValue` which use the new serializer
* Comment in `align_to` about div-by-zero
Co-authored-by: Mazdak Farrokhzad <twingoow@gmail.com>
Signed-off-by: Phoebe Goldman <phoebe@goldman-tribe.org>
* Add benchmark comparing BFLATN -> BSATN with and without the fast path
* Add benchmark on `u64_u64_u32`, which has less interior padding than `u32_u64_u64`
* Remove `to_len` from `to_bsatn_extend`
It turns out to be slower than just eating the `realloc`s.
* Remove unused `to_bsatn_slice`
I thought I would need it, but it ended up not being useful.
* Expand comment with example; `Box<[...]>` to reduce memory footprint
* Comments from Mazdak's review
---------
Signed-off-by: Phoebe Goldman <phoebe@goldman-tribe.org>
Co-authored-by: Mazdak Farrokhzad <twingoow@gmail.com>
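
One way to picture memcpy consolidation and the effect of interior padding is the sketch below. It is an illustrative reconstruction, not the code from these commits: `CopySpan` and `consolidate` are hypothetical names, and the input is assumed to be a list of `(offset_in_row, size)` pairs for fixed-size fields in layout order.

```rust
/// Hypothetical description of one `memcpy`: copy `len` bytes from
/// `src_offset` in the row to `dst_offset` in the serialized output.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct CopySpan {
    pub src_offset: u16,
    pub dst_offset: u16,
    pub len: u16,
}

/// Merges fields that are adjacent in the row (no padding between them)
/// into a single span, so fewer copies are needed per row.
pub fn consolidate(fields: &[(u16, u16)]) -> Vec<CopySpan> {
    let mut spans: Vec<CopySpan> = Vec::new();
    // The output is packed, so the destination offset just accumulates sizes.
    let mut dst = 0u16;
    for &(src, len) in fields {
        if let Some(last) = spans.last_mut() {
            if last.src_offset + last.len == src {
                // Contiguous with the previous span: extend it.
                last.len += len;
                dst += len;
                continue;
            }
        }
        // Padding (or the first field): start a new span.
        spans.push(CopySpan { src_offset: src, dst_offset: dst, len });
        dst += len;
    }
    spans
}
```

Under these assumptions a `u64_u64_u32` row collapses to one copy, while `u32_u64_u64` needs two because of the 4 padding bytes after the leading `u32`.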
`AlgebraicTypeLayout` and friends already include full layout information,
including properly-aligned offsets for `ProductTypeElementLayout`s.
As such, there's no need to do any alignment computation
during `serialize_value` or `write_value`.
Instead, while traversing a `ProductTypeLayout`,
we can use each element's `offset` to update the `curr_offset`.
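
The idea can be sketched as follows, using a hypothetical `ElementLayout` that carries only a precomputed `offset` and `size`; the real `ProductTypeElementLayout` holds a full type layout, and the real `serialize_value` handles far more than raw byte copies.

```rust
/// Hypothetical, simplified element layout: the offset is already
/// aligned when the layout is built, so no alignment math happens here.
pub struct ElementLayout {
    pub offset: u16,
    pub size: u16,
}

/// Writes each element of a product found at `base` in `row` to `out`,
/// deriving the current offset from the element's stored `offset`
/// rather than re-aligning a running cursor.
pub fn serialize_product(row: &[u8], base: usize, elements: &[ElementLayout], out: &mut Vec<u8>) {
    for elt in elements {
        let curr_offset = base + elt.offset as usize;
        let end = curr_offset + elt.size as usize;
        out.extend_from_slice(&row[curr_offset..end]);
    }
}
```

For a `(u32, u64)` product the elements would sit at offsets 0 and 8, and the 4 padding bytes in between are skipped without ever being inspected.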
- Move it and friends from sats to vm.
- `MemTable` now stores a `Vec<PV>`.
- Other related improvements.
Co-authored-by: Phoebe Goldman <phoebe@goldman-tribe.org>
* optimize build_missing_tables; collect less + use read_col
* schema_for_table: don't go through PV
* table_id_from_name: use read_col
* relational_db tests: remove uses of to_product_value
* build_sequence_state: don't use to_pv
* build_indexes: don't use to_pv
* remove dead code: CommittedStateIter
* get_all_tables_tx: use read_col
* SystemTableQuery: don't use to_pv
* StModuleRow: don't use to_pv
* get rid of more to_product_value calls
* read_col / system_tables / mut_tx: cleanup + less work