Commit Graph

151 Commits

Author SHA1 Message Date
Mario Alejandro Montoya Cortés 26f4989a67 Experimental parser with forked sqlparser 2024-09-09 13:54:07 -05:00
Ingvar Stepanyan 607f7ce6b8 Auto-generate C# ModuleDef bindings from Rust (#1680) 2024-09-09 18:22:17 +00:00
Mario Montoya 8ee1de60b4 Update SQL AST in accordance with the SQL spec (#1623)
Co-authored-by: joshua-spacetime <josh@clockworklabs.io>
2024-08-29 23:44:27 +00:00
james gilles 2be42156b2 Allow converting new ModuleDef to old TableSchema (#1630) 2024-08-29 18:17:29 +00:00
james gilles 98faf7d4e1 Add backwards-compatible validation for RawModuleDefV8. (#1606) 2024-08-29 10:10:53 +00:00
james gilles 6e6bee2a9c ABI v9 validation code (#1572)
Signed-off-by: james gilles <jameshgilles@gmail.com>
2024-08-27 20:02:26 +00:00
Mazdak Farrokhzad 3be5c83d99 [WASM ABI 1.0] __call_reducer__ receives Identity & Address by value (#1607)
Signed-off-by: Mazdak Farrokhzad <twingoow@gmail.com>
Co-authored-by: Ingvar Stepanyan <me@rreverser.com>
Co-authored-by: Noa <coolreader18@gmail.com>
2024-08-19 22:20:57 +00:00
james gilles b1f80e0ffe Preliminary Identifier validation (#1584) 2024-08-14 18:00:26 +00:00
Zeke Foppa 2b7b9ff350 Bump version to 0.12.0 (#1578)
Co-authored-by: Zeke Foppa <bfops@users.noreply.github.com>
2024-08-09 19:10:53 +00:00
Mazdak Farrokhzad 1e8e18d74b Add support for I256 and U256 (#1477) 2024-08-08 18:40:35 +00:00
james gilles c046c0b6aa Add ErrorStream combinator (#1543)
Co-authored-by: James Gilles <jgilles@clockworklabs.io>
2024-07-26 19:42:21 +00:00
james gilles 45b2ceee9a Move schemas to schema crate, rename Def to RawDefV8 (#1498) 2024-07-24 17:38:30 +00:00
Phoebe Goldman 1d26575d87 ST sequences: respect allocated amount on restart (#1532)
Co-authored-by: Kim Altintop <kim@eagain.io>
2024-07-24 14:43:07 +00:00
Zeke Foppa 417adb8820 Bump version to 0.11.0 (#1531)
Co-authored-by: Zeke Foppa <github.com/bfops>
2024-07-22 20:48:03 +00:00
james gilles f81f2a7492 Move db module from spacetimedb_sats to spacetimedb_lib (#1479) 2024-07-17 20:59:44 +00:00
Noa 10b151b999 Protobufectomy: server (#1077)
Co-authored-by: Phoebe Goldman <phoebe@goldman-tribe.org>
Co-authored-by: Jeremie Pelletier <jeremiep@gmail.com>
2024-07-12 18:02:18 +00:00
Zeke Foppa dcc70b82f2 Bump version to 0.10.1 (#1443)
Co-authored-by: Zeke Foppa <github.com/bfops>
2024-06-17 23:34:13 +00:00
Mazdak Farrokhzad b1442fc2f1 HACK: Tweak schema_updates to allow adding/removing non-unique indices (#1434) 2024-06-14 14:40:14 +00:00
Noa 66112bbdf0 Impl subscribe subcommand & subscription smoketests (#1343)
Signed-off-by: Kim Altintop <kim@eagain.io>
Co-authored-by: Kim Altintop <kim@eagain.io>
Co-authored-by: Mazdak Farrokhzad <twingoow@gmail.com>
2024-06-14 09:16:05 +00:00
Phoebe Goldman 6c45e76a98 Integrate snapshotting into core (#1344) 2024-06-11 12:40:02 +00:00
Zeke Foppa 2d09485f74 Bump version to 0.10.0 (#1349)
Co-authored-by: Zeke Foppa <github.com/bfops>
2024-06-10 16:25:02 +00:00
Noa a54399495d Prune bindings deps (#1290) 2024-06-07 20:36:33 +00:00
Zeke Foppa 8f3f6bd9d7 Fix Config.save failing if /tmp is on a different filesystem (#1346)
* [bfops/fix-config-saving]:  do thing

* [bfops/fix-config-saving]: review

* [bfops/fix-config-saving]: fix smoketests

* [bfops/fix-config-saving]: use create_new to avoid race condition

---------

Co-authored-by: Zeke Foppa <github.com/bfops>
2024-06-06 16:28:17 +00:00
Zeke Foppa b06b2e59f1 Fix bug with Lockfile sticking around (#1341)
* [bfops/fix-config-lock]: do thing

* [bfops/fix-config-lock]: review

* [bfops/fix-config-lock]: review

* [bfops/fix-config-lock]: fix

* [bfops/fix-config-lock]: TODOs

* [bfops/fix-config-lock]: review

* [bfops/fix-config-lock]: review

* [bfops/fix-config-lock]: review

* [bfops/fix-config-lock]: review

---------

Co-authored-by: Zeke Foppa <github.com/bfops>
2024-06-05 22:40:24 +00:00
Phoebe Goldman 8c5f40db8d Add the snapshot crate, which implements snapshotting at a low level (#1340)
* Add the `snapshot` crate, which implements snapshotting at a low level

- Requires making `BlobHash` be `Serialize` and `Deserialize`.
  For arcane macro-ology reasons, this requires writing `BlobHash::SIZE`
  instead of `Self::SIZE` (it gets embedded in a visitor struct or something).
- Requires adding two new operators to `BlobStore`.
- Adds a return value to `Page::save_content_hash`, for convenience.
- Impls `DerefMut` for `Pages`.
- **Scary change:** adds `Table::pages_mut`.
  I think possibly this operator should be `unsafe`,
  since write access to the `Pages` allows an undisciplined caller
  to violate the `Table`'s assumptions by corrupting a `Page`.
  It seems like an anti-pattern to mark a method `unsafe` on the grounds that
  misusing its return value can cause UB,
  but I don't see a plausible alternative
  without making most methods on `Page` unsafe.
  Open to feedback on this one!

* Nix `Table::pages_mut`

* Address Mazdak's feedback

* Use `thiserror` rather than `anyhow` for better error hygiene
2024-06-05 21:58:12 +00:00
Phoebe Goldman f9cc84e3b4 Define DirTrie, a git-like on-disk object store (#1336)
* Define `DirTrie`, a git-like on-disk object store

* Remove unused iteration code; add simple tests

* Address Mazdak's review
2024-06-05 17:16:59 +00:00
Phoebe Goldman db34ff6a8e Create new crate fs-utils; move Lockfile and create_parent_dir (#1334)
* Create new crate `fs-utils`; move `Lockfile` and `create_parent_dir`

The snapshot crate will need to create lockfiles.
Rather than duplicating code to do so, we choose to move our definition of `Lockfile`
into a crate that can be depended on by both `cli` and `snapshot`.
No existing crate seems like an obvious choice for this
-- a `Lockfile` is not really a data structure, so `data-structures` seems wrong --
so we add a new crate, `fs-utils`.
Currently this contains only `Lockfile` and `create_parent_dir`,
but a follow-up PR will add `DirTrie`, a Git-like on-disk object store.

* Deduplicate `map_err` closure

* Zeke's nit: simplify control flow

Co-authored-by: Zeke Foppa <196249+bfops@users.noreply.github.com>
Signed-off-by: Phoebe Goldman <phoebe@goldman-tribe.org>

---------

Signed-off-by: Phoebe Goldman <phoebe@goldman-tribe.org>
Co-authored-by: Zeke Foppa <196249+bfops@users.noreply.github.com>
2024-06-04 16:57:28 +00:00
Phoebe Goldman a214f78f0b Impl Serialize, Deserialize for Page (#1335)
* Impl `Serialize`, `Deserialize` for `Page`

Snapshotting needs to write `Page`s to files and read them back again.
To that effect, this commit implements `Serialize` and `Deserialize` for `Page`.

* Address Mazdak's review

- Fix soundness in `FixedBitSet` by moving an assert.
- Add commentary to test.
- Add commentary to `spacetimedb-lib` dependency.
2024-06-04 15:49:27 +00:00
Noa f8beb699c7 Implement new rand api (#1283)
* Implement new rand api

* Address comments
2024-05-31 17:42:58 +00:00
Zeke Foppa 0bc42bfe48 Bump version to 0.9.3 (#1224)
* [bfops/bump-version]: empty

* [bfops/bump-version]: version bump

* [bfops/bump-version]: update

* [bfops/bump-version]: bump C# module versions too

* [bfops/bump-version]: bump 0.9.3

---------

Co-authored-by: Zeke Foppa <github.com/bfops>
2024-05-29 17:22:51 +00:00
Kim Altintop 2c3fc66f21 Commitlog: panic on fsync failure (#985)
* commitlog: Panic on fsync failure

Errors returned by `fsync(2)` are particularly nefarious, as it is
mostly undefined what the state of the page cache is in this case.

Since the log is synced asynchronously and not after every write, it is
impossible to know up to which commit data can be considered durable --
except by reading the most recent segment from disk.

Therefore, the reasonable thing to do is to prevent any further use of
the log, and force users to re-load it from disk.

Note that this is only half of the solution: an application restart may
still read data from the page cache, which could be gone after a system
restart.

To fix this, we would need to employ direct I/O (i.e. `O_DIRECT`), which
however is beyond the scope of this patch as it invalidates the use of
most of `std::io`.

* commitlog: Handle duplicate commits when iterating

We cannot exclude the possibility of a false failure in I/O operations.
In particular, `EIO` errors are difficult to attribute to a particular
write, as they happen asynchronously during flush of the page cache.

Because we do not bypass the page cache, the possibility exists that a
particular commit is lost when it isn't, or that it is considered
durable when it isn't. The former could lead to duplicate commits
appearing in the log, while the latter could lead to a matching offset
number, but with different commit payload.

This patch thus ignores duplicates, and introduces a new error variant
in the event the offset matches but the checksum doesn't.

* durability: Manage the flush-and-sync task in this crate

Since syncing the commitlog may now panic, it is more obvious to handle
all async tasks here, so as to be able to handle the panic cases.

Namely, if the `FlushAndSyncTask` panics, the `PersisterTask` is
aborted. This will lead to the channel receiver being dropped, which in
turn will cause the next `append_tx` call to panic.

* commitlog: Remove async flush-and-sync

Due to panic behaviour, it is now preferable to manage periodic sync at
the use site of the commitlog crate.

Hence remove `flush_and_sync_every` method, and with it the dependency
on tokio.
2024-05-28 18:22:38 +00:00
Noa 55b7cbe486 Let ProgramStorage::external be async (#1291)
* Let ProgramStorage::external be async

* Remove core::object_db

* Remove odb_rocksdb feature

* Fix typo

* More resilient conflict avoidance
2024-05-24 21:28:54 +00:00
Kim Altintop 2de147522d core: Collapse DBIC into HostController (#1186)
Make it so `HostController` manages both the module host (wasm
machinery) and the database (`RelationalDB` / `DatabaseInstanceContext`)
of spacetime databases deployed to a server.

The `DatabaseInstanceContextController` (DBIC) is removed in the
process.

This allows to make database accesses panic-safe, in that uncaught
panics will cause all resouces to be released and the database to be
restarted on subsequent access. This is a prerequisite for #985.

It also allows to move towards storage of the module binary directly in
the database / commitlog. This patch, however, makes some contortions in
order to **not** introduce a breaking change just yet.
2024-05-21 17:30:02 +00:00
Mazdak Farrokhzad 91f7e8c917 add PageHeader::unmodified_hash, a BLAKE3 hash for snapshotting (#1249) 2024-05-20 17:47:42 +00:00
Zeke Foppa c7f191fb5d Bump version to 0.9.0 (#1055)
* [bfops/bump-version]:

* [bfops/bump-version]: bump lockfile

---------

Co-authored-by: Zeke Foppa <github.com/bfops>
2024-05-13 18:28:17 +00:00
Phoebe Goldman 484ba824ba Make Page always fully init (#1193)
* Make `Page` always fully init

Per discussion on the snapshotting proposal,
this PR changes the type of `Page.row_data` to `[u8; _]`,
where previously it was `[MaybeUninit<u8>; _]`.

This turns out to be shockingly easy,
as our serialization codepaths never write padding bytes into a page.
The only place pages ever became `poison` was the initial allocation;
changing this to `alloc_zeroed` causes the `row_data` to always be valid at `[u8; _]`.

The majority of this diff is replacing `MaybeUninit`-specific operators
with their initialized equivalents,
and updating comments and documentation to reflect the new requirements.

This change also revealed a bug in the benchmarks
introduced when we swapped the order of sum tags and payloads
( https://github.com/clockworklabs/SpacetimeDB/pull/1063 ),
where benchmarks used a hardcoded offset for the tag which had not been updated.

* Update blake3

Blake3 only supports running under Miri as of 1.15.1, the latest version.
Prior versions hard-depended on SIMD intrinsics which Miri doesn't support.

* Address Mazdak's review.

Still pending his agreeing with me that `poison` is a better name than `uninit`.

* "Poison" -> "uninit"

Against my best wishes, for consistency with the broader Rust community's poor choices.

* Remove unnecessary `unsafe` blocks

* More unnecessary `unsafe`; remove forgotten SAFETY comments
2024-05-02 23:15:48 +00:00
Mazdak Farrokhzad 7c52ef555a clarify cost of dropping the updates in eval_incr (#1192) 2024-05-02 16:47:34 +00:00
Mazdak Farrokhzad b55121cc83 use a custom FixedBitSet + optimize Page::iter_fixed_len (#1160) 2024-04-30 21:57:28 +00:00
Mazdak Farrokhzad fd44242e99 1. Add Hash for RowRef + make it consistent with PV. (#1163)
2. Make `RowRef::row_hash` use the above.
3. Make `Table::insert` return a `RowRef`.
4. Use less unsafe because of 1-3.
5. Use `second-stack` to reuse temporary allocations in hashing and serialization.
2024-04-30 17:59:58 +00:00
Mazdak Farrokhzad cb0c09bab0 Define Hash + Eq for BSATN (#1112)
* add hash_bsatn + move proptest generators to sats crate

* add eq_bsatn
2024-04-24 23:06:22 +00:00
Ingvar Stepanyan 45f6cd6f0c Fix codegen tests (#1146)
While working on the new C# codegen, I accidentally noticed that those tests were passing even when they clearly should've been failing due to changed output.

After running with `--nocapture`, I found out it's because the tests are silently skipped and reported as successful when `rust_wasm_test.wasm` isn't built.

This further led to finding that `rust_wasm_test.wasm` is never built - the relevant module results in `rust_wasm_test_module.wasm` instead - so these tests have been incorrectly passing for ages.

This PR changes them to actually build the module as part of testing and updates the snapshots to latest master.
2024-04-24 14:03:03 +00:00
Kim Altintop 47048559b4 core: Integrate new commitlog + durability (#926)
This patch attempts to integrate the new commitlog with the minimum
changes.

Most of the diff comes from deletions of the legacy log and the need to
adjust tests due to the requirement for a tokio runtime when a durable
database is used in tests.

The "meat" of the patch are the `RelationalDB` constructors,
`RelationalDB::commit_tx`, and the replay logic in
`locking_tx_datastore`.

While `DataKey` is gone, there is still some redundant data being passed
around, which will be addressed in the follow-up patch.
2024-04-11 22:46:31 +00:00
Kim Altintop 02be002416 Durability: Traits and implementation in terms of commitlog (#922)
Defines traits intended to abstract over the kind of persistence a
database utilizes. The only implementation is (host-)local durability in
terms of the new commitlog crate.

The trait definitions may not be considered stable yet, but are in their
tentative form needed for further integration of the new commitlog.
2024-04-11 09:44:58 +00:00
Noa abdaf88563 Move lib::{name,recovery} to client-api-messages (#570) 2024-04-10 20:24:05 +00:00
Phoebe Goldman 6d91d57f3c Detect unsatisfiable range queries; warn and short-circuit. (#1036)
* Detect unsatisfiable range queries; warn and short-circuit.

This commit fixes a panic caused by unsatisfiable range bounds on an index query,
e.g. `WHERE x < 5 AND x > 5`.
These unsatisfiable bounds made Rust's `BTreeMap` angry
(See https://doc.rust-lang.org/src/alloc/collections/btree/search.rs.html#106-124),
and panicked.
They also represent probable bugs,
as it's silly to write a query which statically will return no rows.

With this commit, we detect statically unsatisfiable bounds in two cases:
- When compiling queries, we log a message at `WARN` containing the offending query.
- When evaluating queries, we silently construct an `EmptyRelOps`
  rather than a real query iterator.

This commit also adds a test that the offending queries can be compiled and executed
without panicking, and select no rows.

* Per Joshua's review, add comments that this is a suboptimal solution

* Fix typo

---------

Co-authored-by: Zeke Foppa <196249+bfops@users.noreply.github.com>
2024-04-09 01:23:20 +00:00
Mazdak Farrokhzad 344861f290 use nohasher_hash and ahash instead of siphash13 (#1040)
* use nohasher_hash and ahash instead of siphash13

* re-export types in spacetimedb_data_structures::map
2024-04-05 17:30:51 +00:00
Noa 99b2fd426f Bump to reqwest 0.12 (uses hyper 1.0) (#1031) 2024-04-03 01:54:53 +00:00
Noa 7d3bdc308b Prune deps from bindings dependency tree (#1014) 2024-04-03 01:54:36 +00:00
Kim Altintop 1d316d991e Commitlog: Add canonical txdata payload (#921)
Defines the canonical commitlog payload, and how to encode / decode it.

Also exposes folds alongside iterators, which allows the common case of
replaying the commitlog onto a database to be further optimized (the
`Txdata` does not have to be constructed in this case). This
optimization is, however, left for a future patch.
2024-04-02 09:54:19 +00:00
Kim Altintop 73cd78231e Commitlog: Add I/O based on regular files (#920)
Provides a commitlog backing store based on files, and defines the
exported `Commitlog` type which fixes the store to the file-based one.
2024-04-02 09:10:21 +00:00