24 Commits

Author SHA1 Message Date
Noa 742303ca49 Bump rust-toolchain to rust 1.88 (#2749)
Co-authored-by: Mazdak Farrokhzad <twingoow@gmail.com>
2025-07-15 17:39:41 +00:00
Kim Altintop a6bc0e59fd commitlog: Reduce log noise when offset index cannot be used (#2791) 2025-06-02 07:27:12 +00:00
Kim Altintop c98219529d commitlog: Fix index truncation test. (#2792) 2025-05-30 15:13:06 +00:00
Shubham Mishra f2a9657a72 Commitlog: handle empty offset index lookup (#2771) 2025-05-22 12:59:02 +00:00
Kim Altintop 1f4207de86 commitlog: Include latest commit offset in segment metadata (#2733) 2025-05-19 06:54:58 +00:00
Noa 483a9488e2 Update rand (#2568) 2025-04-11 17:39:41 +00:00
Shubham Mishra 76a52ca747 Use Offset Index on Meta extract (#2549) 2025-04-09 19:05:52 +00:00
Noa a5212a5f75 Commitlog compression (#2504) 2025-03-31 22:00:52 +00:00
Kim Altintop 5063bd8759 commitlog: Streaming (#2492) 2025-03-26 07:40:23 +00:00
Shubham Mishra 7cb509c2e2 handle offset index empty (#2344) 2025-03-05 11:04:10 +00:00
Kim Altintop 8054999927 commitlog: Use fdatasync (#2338) 2025-03-04 15:20:14 +00:00
Kim Altintop c5f4c8bc5c commitlog: Make offset index usable externally (#2108) 2025-01-14 18:56:08 +00:00
Kim Altintop a191055f56 commitlog: Fix offset index truncation (#2073) 2024-12-19 15:44:10 +00:00
Kim Altintop 125ab58388 commitlog: Fix set_epoch (#2005) 2024-11-21 13:34:10 +00:00
Kim Altintop f22b163c0a commitlog: Introduce epoch (#1851) 2024-11-05 10:10:30 +00:00
Kim Altintop d09e1eabd2 commitlog: Improve skipping behavior of traversals (#1902) 2024-10-28 08:07:40 +00:00
Shubham Mishra c5a9a0d9ca offset index read integration (#1779) 2024-10-07 12:12:50 +00:00
Kim Altintop ae17c6d57e commitlog: Return commit info (#1778) 2024-10-01 17:31:42 +00:00
Shubham Mishra a7e23210f6 Downlevel some info messages (#1739) 2024-09-25 05:54:21 +00:00
Shubham Mishra eeaa00a05f Commitlog offset index (#1671)
Signed-off-by: Shubham Mishra <shubham@clockworklabs.io>
Co-authored-by: Kim Altintop <kim@eagain.io>
2024-09-24 16:06:49 +00:00
Jeremie Pelletier f91dcda283 Make some commitlog helpers public (#1390) 2024-07-09 18:02:58 +00:00
Kim Altintop 2c3fc66f21 Commitlog: panic on fsync failure (#985)
* commitlog: Panic on fsync failure

Errors returned by `fsync(2)` are particularly nefarious, as it is
mostly undefined what the state of the page cache is in this case.

Since the log is synced asynchronously and not after every write, it is
impossible to know up to which commit data can be considered durable --
except by reading the most recent segment from disk.

Therefore, the reasonable thing to do is to prevent any further use of
the log, and force users to re-load it from disk.

Note that this is only half of the solution: an application restart may
still read data from the page cache, which could be gone after a system
restart.

To fix this, we would need to employ direct I/O (i.e. `O_DIRECT`), which
however is beyond the scope of this patch as it invalidates the use of
most of `std::io`.
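
The policy described above can be sketched roughly as follows. This is an illustration only, not the crate's code: the `Durable` trait, its `sync_to_disk` method, and `sync_or_panic` are hypothetical names introduced here, and `File::sync_data` is used as the underlying sync call.

```rust
use std::fs::File;
use std::io;

/// Hypothetical abstraction over storage that can be synced to disk.
/// Introduced for this sketch only, so that a failure can be simulated.
pub trait Durable {
    fn sync_to_disk(&self) -> io::Result<()>;
}

impl Durable for File {
    fn sync_to_disk(&self) -> io::Result<()> {
        // Ask the OS to write file data to stable storage.
        self.sync_data()
    }
}

/// Sync the storage and panic if the sync fails.
///
/// After a failed sync the state of the page cache is unknown, so the
/// caller cannot tell which commits are durable. Panicking prevents any
/// further use of the in-memory log handle; the log has to be re-loaded
/// from disk.
pub fn sync_or_panic<S: Durable>(storage: &S) {
    if let Err(e) = storage.sync_to_disk() {
        panic!("sync failed, durable state unknown; re-load the log from disk: {e}");
    }
}
```

A caller would invoke `sync_or_panic(&file)` wherever it previously propagated the sync error, and treat a panic as the signal to drop the handle and re-open the log.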

* commitlog: Handle duplicate commits when iterating

We cannot exclude the possibility of a false failure in I/O operations.
In particular, `EIO` errors are difficult to attribute to a particular
write, as they happen asynchronously during flush of the page cache.

Because we do not bypass the page cache, the possibility exists that a
particular commit is considered lost when it isn't, or that it is
considered durable when it isn't. The former could lead to duplicate
commits appearing in the log, while the latter could lead to a matching
offset number, but with a different commit payload.

This patch thus ignores duplicates, and introduces a new error variant
in the event the offset matches but the checksum doesn't.
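
A minimal sketch of this iteration rule, under stated assumptions: `Commit`, `IterError::ChecksumMismatch`, and `dedup_commits` are hypothetical names for this example, and a commit is reduced to its offset and checksum. The real types and error variant in the crate may differ.

```rust
/// Hypothetical, simplified commit: only the fields the rule needs.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Commit {
    pub offset: u64,
    pub checksum: u32,
}

/// Hypothetical error for a repeated offset with a different payload.
#[derive(Debug, PartialEq, Eq)]
pub enum IterError {
    ChecksumMismatch { offset: u64, expected: u32, found: u32 },
}

/// Wrap a commit stream so that exact duplicates are skipped and a
/// repeated offset with a different checksum is reported as an error.
pub fn dedup_commits<I>(commits: I) -> impl Iterator<Item = Result<Commit, IterError>>
where
    I: IntoIterator<Item = Commit>,
{
    // Offset and checksum of the last commit that was yielded.
    let mut last: Option<(u64, u32)> = None;
    commits.into_iter().filter_map(move |commit| match last {
        Some((offset, checksum)) if offset == commit.offset => {
            if checksum == commit.checksum {
                // Same offset, same checksum: a harmless duplicate.
                None
            } else {
                // Same offset, different payload: surface as an error.
                Some(Err(IterError::ChecksumMismatch {
                    offset,
                    expected: checksum,
                    found: commit.checksum,
                }))
            }
        }
        _ => {
            last = Some((commit.offset, commit.checksum));
            Some(Ok(commit))
        }
    })
}
```

The sketch only compares against the immediately preceding commit, which assumes duplicates appear adjacent to each other in the log.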

* durability: Manage the flush-and-sync task in this crate

Since syncing the commitlog may now panic, it is more obvious to handle
all async tasks here, so as to be able to handle the panic cases.

Namely, if the `FlushAndSyncTask` panics, the `PersisterTask` is
aborted. This will lead to the channel receiver being dropped, which in
turn will cause the next `append_tx` call to panic.
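
The failure propagation can be illustrated with a simplified sketch. It is not the crate's implementation: it uses a single `std::thread` worker and `std::sync::mpsc` instead of the two tokio tasks named above, and `Durability`, `spawn_persister`, and the `Vec<u8>` payload are hypothetical stand-ins.

```rust
use std::sync::mpsc::{channel, Sender};
use std::thread::{self, JoinHandle};

/// Hypothetical handle held by the application.
pub struct Durability {
    tx: Sender<Vec<u8>>,
}

impl Durability {
    /// Queue a transaction for persistence.
    ///
    /// Panics if the worker is gone, because its receiver was dropped
    /// and the send therefore fails.
    pub fn append_tx(&self, txdata: Vec<u8>) {
        self.tx
            .send(txdata)
            .expect("persister is gone; the log must be re-loaded from disk");
    }
}

/// Spawn a worker that persists every queued transaction via `persist`.
/// If `persist` panics (for example on a failed sync), the worker
/// unwinds and drops the receiver.
pub fn spawn_persister<F>(mut persist: F) -> (Durability, JoinHandle<()>)
where
    F: FnMut(Vec<u8>) + Send + 'static,
{
    let (tx, rx) = channel::<Vec<u8>>();
    let handle = thread::spawn(move || {
        for txdata in rx {
            persist(txdata);
        }
    });
    (Durability { tx }, handle)
}
```

Once the worker has panicked, every later `append_tx` call fails loudly instead of silently queueing data that will never become durable.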

* commitlog: Remove async flush-and-sync

Due to panic behaviour, it is now preferable to manage periodic sync at
the use site of the commitlog crate.

Hence remove the `flush_and_sync_every` method, and with it the
dependency on tokio.
2024-05-28 18:22:38 +00:00
Kim Altintop 1d316d991e Commitlog: Add canonical txdata payload (#921)
Defines the canonical commitlog payload, and how to encode / decode it.

Also exposes folds alongside iterators, which allows the common case of
replaying the commitlog onto a database to be further optimized (the
`Txdata` does not have to be constructed in this case). This
optimization is, however, left for a future patch.
2024-04-02 09:54:19 +00:00
Kim Altintop 3b343e4eb1 Commitlog: Base implementation "sans I/O" (#919)
First in a series of patches to implement the new commitlog format.

This patch implements the base format, leaving the transaction payload
generic. Segment handling, writing and reading are implemented based on
an in-memory backend, which greatly simplifies testing.

As a notable deviation from the previous implementation, segments are
never implicitly trimmed. Instead, faulty commits are ignored if and
only if the next commit in the log sequence is valid and has the right
offset. On the write path, this entails closing the active segment when
an (I/O) error occurs, but retaining the commit in memory such that it
is written to the next segment.
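
One possible reading of the read-side rule, as a sketch only: `Entry`, `TraverseError`, and `replay` are hypothetical names, and "the right offset" is assumed to mean the offset the faulty commit should have carried, since the retained commit is rewritten to the next segment.

```rust
/// Hypothetical view of what a traversal reads from the log.
#[derive(Debug)]
pub enum Entry {
    /// A commit that decoded and checksummed correctly.
    Valid { offset: u64 },
    /// A commit that failed to decode or verify.
    Faulty,
}

#[derive(Debug, PartialEq, Eq)]
pub enum TraverseError {
    /// A faulty commit that is not followed by a valid replacement.
    Corrupt { index: usize },
    /// A valid commit whose offset breaks the sequence.
    OutOfOrder { expected: u64, found: u64 },
}

/// Return the offsets of all commits in order, skipping a faulty entry
/// only if the next entry is valid and carries the expected offset.
pub fn replay(entries: &[Entry], first_offset: u64) -> Result<Vec<u64>, TraverseError> {
    let mut expected = first_offset;
    let mut seen = Vec::new();
    for (index, entry) in entries.iter().enumerate() {
        match entry {
            Entry::Valid { offset } if *offset == expected => {
                seen.push(*offset);
                expected += 1;
            }
            Entry::Valid { offset } => {
                return Err(TraverseError::OutOfOrder { expected, found: *offset });
            }
            Entry::Faulty => match entries.get(index + 1) {
                // The successor replaces the faulty commit: ignore it.
                Some(Entry::Valid { offset }) if *offset == expected => continue,
                // No valid replacement: the log is corrupt at this point.
                _ => return Err(TraverseError::Corrupt { index }),
            },
        }
    }
    Ok(seen)
}
```

Nothing is trimmed in this sketch: the faulty entry stays in the log and is merely skipped during traversal.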

Note that this patch does not define the final public API.
2024-04-02 06:18:30 +00:00