SpacetimeDB

mirror of https://github.com/clockworklabs/SpacetimeDB.git synced 2026-05-13 03:08:40 -04:00

Author	SHA1	Message	Date
Kim Altintop	05d4874918	Create `db.lock` file only for persistent databases (#3912 ) This is the first step to make in-memory only databases not touch the disk at all. Pending is an in-memory only sink for module logs. Responsibility for the lock file is transferred to `Durability`, which means that only persistent databases opened for writing acquire the lock. As a consequence, the `Durability` trait gains a `close` method that prevents further writes and drains the internal buffers, even when multiple `Arc`-pointers to the `Durability` exist. # Expected complexity level and risk 2 # Testing Covered by existing tests.	2026-01-08 08:22:37 +00:00
Kim Altintop	e2b4113ffb	Async shutdown for database / durability (#3880 ) Controlled shutdown of a database should drain the outstanding transactions queue(s) and flush them to the durability layer. With the introduction of another queueing layer in #3868, it became harder to observe when or if this process is completed. This patch thus introduces an explicit (async) shutdown method for `RelationalDB` and below, which will wait until all submitted transactions are either reported durable, or an error occurs in the durability layer. `RelationalDB` is made `!Clone`, such that shutdown can be initiated in the `Drop` impl. Note that this requires access to a tokio runtime, which we thread through via the `Persistence` services in order to allow control over which of the various runtimes is being used for durability-related tasks. Also moves `RelationalDB::open` to a blocking thread when a persistence-enabled database is constructed by the `HostController` -- this process performs heavy I/O and can take a substantial amount of time, during which we don't want to block a worker thread. # API and ABI breaking changes None # Expected complexity level and risk 3 # Testing - [ ] some testing added - [ ] existing tests still pass - [ ] `impl Drop for RelationalDB` difficult to test, extra eyeballs needed --------- Co-authored-by: Mazdak Farrokhzad <twingoow@gmail.com>	2025-12-17 18:28:42 +00:00
Kim Altintop	cfd0d4b712	commitlog,durability: Support preallocation of disk space (#3437 ) When a new commitlog segment is created, allocate disk space for it up to the maximum segment size. Also do this when resuming writes to an existing segment, such that segments created without preallocation will allocate as well when the database is opened. Preallocation is gated behind the feature "fallocate", because it is not always desirable to preallocate, e.g. for local `standalone` users. The feature can only be enabled on Linux targets, because allocation is done using the Linux-specific `fallocate(2)` system call. Unlike `ftruncate(2)` or the portable `posix_fallocate(3)`, `fallocate(2)` supports allocating disk space without zeroing. This is currently required, because the commitlog format does not handle padding bytes. If not enough space can be allocated, the commitlog refuses writes. For commitlogs that were created without preallocation, this means that the commitlog cannot even be opened in this situation. The local durability impl will crash if it detects that the commitlog is unable to allocate enough space. This means that a database will eventually crash and be unable to start in an out-of-space situation. Allocated space is not included in the reported size of the commitlog. Instead, allocated blocks are reported separately. # Expected complexity level and risk 3 - Disk size monitoring may need to be adjusted. # Testing - [x] Adds a test that demonstrates the crash behavior of [`spacetimedb_durability::Local`] when there is insufficient space. The test performs I/O against a loop device. - [x] Modified the `repo::Memory` impl so that it can run out of space. No test currently utilizes this, but existing tests assuming infinite space still pass.	2025-11-10 16:55:55 +00:00
Shubham Mishra	9b87337ba8	split create snapshot method (#3344 ) # Description of Changes Split `create_snapshot` method to expose `write_snapshot_file`. Required for https://github.com/clockworklabs/SpacetimeDB/pull/3344 # Expected complexity level and risk 1 # Testing Existing tests	2025-10-03 11:44:46 +00:00
Kim Altintop	e0bfe00136	core: Allow snapshot worker to be reused (#3331 ) Make it so the `SnapshotWorker` can be re-configured with a new committed state. This allows event subscriptions to remain valid while a replica transitions from leader to follower and vice versa. This is considerably simpler than keeping the lifetimes of database and persistence services strictly in-sync, at the expense of an idle task per replica. # Expected complexity level and risk 1.5 --------- Signed-off-by: Kim Altintop <kim@eagain.io> Co-authored-by: Phoebe Goldman <phoebe@clockworklabs.io>	2025-10-02 22:14:27 +00:00
Kim Altintop	ffaf79100c	Improve snapshot compression + metrics (#3296 ) Report more metrics about snapshot compression, namely: - time to compress a single snapshot (histogram) - for each compression pass: - number of snapshots found to be already compressed (gauge) - number of snapshots compressed (gauge) - cumulative number of objects compressed (gauge) - cumulative number of objects hardlinked (gauge) Those metrics are collected from the `spacetimedb-snapshot` crate without imposing a prometheus dependency on it, i.e. they can be observed by the caller as ordinary Rust types. This is exploited to avoid scanning the entire snapshot repository on each pass -- only the range `(last_compressed + 1)..newest_snapshot` is visited (note that the `compress_snapshots` method now short-circuits on errors). Lastly, the snapshot worker can now be configured to disable compression. This greatly simplifies implementation of alternative post-processing strategies, e.g. involving archival, for which a more coarse-grained compression strategy may be more appropriate. Subscribers are notified of a new snapshot _after_ compression, such that any filesystem locks should be released. # Expected complexity level and risk 2 # Testing May need some, I'm pondering.	2025-10-01 11:46:48 +00:00
Kim Altintop	a675cb36d2	Expand scope of `DurabilityProvider` to include snapshotting (#3295 ) The `DurabilityProvider` trait was introduced to enable the `HostController` to procure an alternative `Durability` impl from an external source. It is also useful to be able to instantiate a `SnapshotWorker` externally, in order to subscribe to snapshot creation events without access to the `RelationalDB` instance it is operating on. At a later stage, we may also use it to control the snapshot frequency externally. This patch thus reframes the trait as `PersistenceProvider`, whose job is to provide persistence-related services. Also separates snapshot creation and compression of older snapshots, and adds instrumentation to gather timing information for both. # Description of Changes Re-submit of #3281 (reverted by #3293), with only the intended changes. # Expected complexity level and risk 1.5 # Testing No functional changes.	2025-10-01 05:50:37 +00:00
Kim Altintop	311462760a	Revert "Expand scope of `DurabilityProvider` to include snapshotting (#3281 )" (#3293 ) This reverts commit `2b61190d4d`. An accident happened, and the patch contains changes that were intended for a separate PR. Perhaps better to start over.	2025-09-25 14:30:59 +00:00
Kim Altintop	2b61190d4d	Expand scope of `DurabilityProvider` to include snapshotting (#3281 ) The `DurabilityProvider` trait was introduced to enable the `HostController` to procure an alternative `Durability` impl from an external source. It is also useful to be able to instantiate a `SnapshotWorker` externally, in order to subscribe to snapshot creation events without access to the `RelationalDB` instance it is operating on. At a later stage, we may also use it to control the snapshot frequency externally. This patch thus reframes the trait as `PersistenceProvider`, whose job is to provide persistence-related services. Also separates snapshot creation and compression of older snapshots, and adds instrumentation to gather timing information for both. # Expected complexity level and risk 1.5 # Testing Not a functional change, existing tests should cover that.	2025-09-25 12:53:53 +00:00
joshua-spacetime	953ea94e57	utilities for archiving snapshots (#3224 ) # Description of Changes Adds utilities for marking and deleting snapshot directories that have been archived # API and ABI breaking changes None # Expected complexity level and risk 1 # Testing Testing will be handled by the patch that adds archival	2025-09-04 22:13:46 +00:00
Zeke Foppa	f6f0909ea4	Update all licenses (#3002 ) # Description of Changes We recently merged several repos together. This PR clarifies the license terms for several subdirectories, as well as the relationship between the licenses. The licenses in our subdirectories have become symbolic links to licenses in our toplevel `licenses` directory. For any particular subdirectory's license file in the diff, you can click `... -> View file` and then click on the text that says "Symbolic Link" on that page. This will take you to the license file that it links to. I have also updated the `tools/upgrade-version` script to update the change date in the new `licenses/BSL.txt` file. # API and ABI breaking changes None. # Expected complexity level and risk 1 # Testing None. Only changes to license files. --------- Co-authored-by: Zeke Foppa <bfops@users.noreply.github.com>	2025-08-12 18:20:58 +00:00
Phoebe Goldman	dd7888fedd	Drop log for already-compressed snapshot to debug (#3131 ) # Description of Changes On databases with many already-compressed snapshots, this was leading to log spam without providing any useful information. # API and ABI breaking changes N/a # Expected complexity level and risk 1 # Testing N/a	2025-08-08 15:33:10 +00:00
Noa	742303ca49	Bump rust-toolchain to rust 1.88 (#2749 ) Co-authored-by: Mazdak Farrokhzad <twingoow@gmail.com>	2025-07-15 17:39:41 +00:00
Tyler Cloutier	20b087c248	Split datastore into its own crate (#2933 )	2025-07-12 21:41:00 +00:00
joshua-spacetime	c8716106ff	Record transaction metrics off the main thread (#2910 )	2025-07-01 15:51:05 +00:00
Mazdak Farrokhzad	0c3635188d	Auto-migrate: Allow adding new variants at the tail (#2874 )	2025-06-27 17:29:31 +00:00
Viktor Szépe	f6da9e1f5f	Fix typos (#2812 ) Signed-off-by: Viktor Szépe <viktor@szepe.net>	2025-06-04 16:33:32 +00:00
Kim Altintop	3d1a91c25c	Handle snapshot restore more robustly (#2735 ) Signed-off-by: Kim Altintop <kim@eagain.io> Signed-off-by: Shubham Mishra <shivam828787@gmail.com> Co-authored-by: Shubham Mishra <shubham@clockworklabs.io>	2025-05-15 14:35:09 +00:00
Mazdak Farrokhzad	373e47db39	`PagePool::{default -> new_for_test}` + temporary hack for `IN_MEMORY_CONFIG` / `test_index_scans` (#2707 )	2025-05-12 13:15:07 +00:00
Kim Altintop	763acb7bfd	snapshot: Improve memory utilization of snapshot fetcher (#2715 ) Signed-off-by: Kim Altintop <kim@eagain.io> Co-authored-by: Mazdak Farrokhzad <twingoow@gmail.com>	2025-05-09 07:12:42 +00:00
Kim Altintop	e32fc4af9c	snapshot: Provide streaming snapshot verification. (#2691 )	2025-05-08 15:59:58 +00:00
Mazdak Farrokhzad	eb589728c6	Allocate pages using a multi-tenant lock-free pool (#2587 )	2025-04-28 17:35:19 +00:00
Zeke Foppa	118e59de14	CI - Do some basic checks that crates are publishable (#2660 ) Co-authored-by: Zeke Foppa <bfops@users.noreply.github.com>	2025-04-23 17:08:43 +00:00
Mario Montoya	3fd78203c4	Compress the snapshot (#2034 )	2025-04-11 15:18:17 +00:00
Kim Altintop	85c347cba3	snapshot: Remote synchronization (#2559 )	2025-04-08 07:57:52 +00:00
Mario Montoya	f9f38543c8	Add readmes to all implementation crates specifying that they do not offer stable interfaces (#2320 )	2025-03-06 19:50:17 +00:00
Phoebe Goldman	aedc601145	Rename `Address` to `ConnectionId` (#2220 ) Signed-off-by: Phoebe Goldman <phoebe@goldman-tribe.org> Co-authored-by: James Gilles <jameshgilles@gmail.com> Co-authored-by: Tyler Cloutier <cloutiertyler@users.noreply.github.com>	2025-02-10 00:40:16 +00:00
Kim Altintop	ea05e108d7	snapshot: Invalidate newer snapshots on creation (#2143 )	2025-01-22 20:36:05 +00:00
Noa	f136670420	Directory structure impl (#1879 ) Co-authored-by: Jeffrey Dallatezza <jeffreydallatezza@gmail.com>	2024-11-12 04:24:43 +00:00
Tyler Cloutier	83fc5c33d4	The banishment of Address (#1880 ) Co-authored-by: Jeffrey Dallatezza <jeffreydallatezza@gmail.com>	2024-10-23 01:56:20 +00:00
Tyler Cloutier	d6bb05b072	Renamed database instance to replica (#1806 )	2024-10-05 18:53:20 +00:00
Kim Altintop	d5173d3d50	snapshot: More `Snapshot` accessors (#1721 )	2024-09-25 05:43:50 +00:00
Kim Altintop	35de1ef920	snapshot: pub access to snapshot file + object repo (#1709 ) Signed-off-by: Kim Altintop <kim@eagain.io> Co-authored-by: Phoebe Goldman <phoebe@clockworklabs.io>	2024-09-16 19:34:09 +00:00
james gilles	9e178bd772	Placate some clippy errors in `snapshot` (#1611 )	2024-08-20 16:44:19 +00:00
Phoebe Goldman	6c45e76a98	Integrate snapshotting into core (#1344 )	2024-06-11 12:40:02 +00:00
Phoebe Goldman	8c5f40db8d	Add the `snapshot` crate, which implements snapshotting at a low level (#1340 ) * Add the `snapshot` crate, which implements snapshotting at a low level - Requires making `BlobHash` be `Serialize` and `Deserialize`. For arcane macro-ology reasons, this requires writing `BlobHash::SIZE` instead of `Self::SIZE` (it gets embedded in a visitor struct or something). - Requires adding two new operators to `BlobStore`. - Adds a return value to `Page::save_content_hash`, for convenience. - Impls `DerefMut` for `Pages`. - Scary change: adds `Table::pages_mut`. I think possibly this operator should be `unsafe`, since write access to the `Pages` allows an undisciplined caller to violate the `Table`'s assumptions by corrupting a `Page`. It seems like an anti-pattern to mark a method `unsafe` on the grounds that misusing its return value can cause UB, but I don't see a plausible alternative without making most methods on `Page` unsafe. Open to feedback on this one! * Nix `Table::pages_mut` * Address Mazdak's feedback * Use `thiserror` rather than `anyhow` for better error hygiene	2024-06-05 21:58:12 +00:00

36 Commits