7 Commits

Author SHA1 Message Date
Kim Altintop 17cc15ef4c Append commit instead of individual transactions to commitlog (#4404)
Re-open #4140 (reverted in #4292).


The original patch was merged a bit too eagerly.
It should go in _after_ 2.0 is released with some confidence.
2026-03-03 15:08:05 +00:00
Noa e3582131fe Migrate to Rust 2024 (#3802)
# Description of Changes

It'd be best to review this commit-by-commit, using
[difftastic](https://difftastic.wilfred.me.uk) to easily tell when a
change is minor in terms of syntax even though a line-based diff doesn't
show that.

# Expected complexity level and risk

3 - edition2024 does bring changes to drop order, which could cause
issues with locks, but I looked through [all of the warnings that
weren't fixed
automatically](https://gistcdn.githack.com/coolreader18/80485ae5c5f82de1784229cce2febb26/raw/ba80f3fecda66ceb34f4f7ad73b98ea02d4893a2/warnings.html)
and couldn't find any issues.

# Testing

n/a; internal code change
2026-03-03 11:06:52 +00:00
clockwork-labs-bot bad5335114 Revert "Append commit instead of individual transactions to commitlog (#4140)" (#4292)
Reverts #4140 per @kim's request — was not ready to merge yet.

Co-authored-by: clockwork-labs-bot <clockwork-labs-bot@users.noreply.github.com>
2026-02-13 10:23:24 -05:00
Kim Altintop c4c3bf78b3 Append commit instead of individual transactions to commitlog (#4140)
Changes the commitlog (and durability) write API, such that the caller
decides how many transactions are in a single commit, and has to supply
the transaction offsets.

This simplifies commitlog-side buffering logic to essentially a
`BufWriter` (which, of course, we must not forget to flush). This will
help throughput, but offers less opportunity to retry failed writes.
This is probably a good thing, as disks can fail in erratic ways, and we
should rather crash and re-verify the commitlog (suffix) than continue
writing.

To that end, this patch liberally raises panics when there is a chance
that internal state could be "poisoned" by partial writes, which may be
debatable.


# Motivation

The main motivation is to avoid maintaining the transaction offset in
two places in such a way that they could diverge. As ordering commits is
the responsibility of the datastore, we make it authoritative on this
matter -- the commitlog will still check that offsets are contiguous,
and refuse to commit if that's not the case.
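
A minimal sketch of the contiguity check described above, not the actual
commitlog code: the names `Commit`, `Log`, `AppendError` and the field
`min_tx_offset` are hypothetical, and the record encoding is reduced to raw
bytes. The caller supplies the offset of the first transaction in each commit,
and the log refuses the commit if that offset does not continue where the
previous one ended. For simplicity the sketch returns an error on I/O failure,
whereas the patch as described prefers to panic.

```rust
use std::io::{self, BufWriter, Write};

/// Hypothetical commit: a batch of transactions chosen by the caller.
/// `min_tx_offset` is the offset of the first transaction in the batch.
struct Commit {
    min_tx_offset: u64,
    records: Vec<Vec<u8>>,
}

#[derive(Debug, PartialEq)]
enum AppendError {
    /// The supplied offset does not continue where the log left off.
    OutOfOrder { expected: u64, actual: u64 },
    /// The underlying writer failed.
    Io(io::ErrorKind),
}

/// Hypothetical log: buffering is nothing more than a `BufWriter`.
struct Log<W: Write> {
    out: BufWriter<W>,
    next_tx_offset: u64,
}

impl<W: Write> Log<W> {
    fn new(inner: W) -> Self {
        Self { out: BufWriter::new(inner), next_tx_offset: 0 }
    }

    /// Append one commit. The caller is authoritative on offsets;
    /// the log only verifies that they are contiguous.
    fn append(&mut self, commit: &Commit) -> Result<(), AppendError> {
        if commit.min_tx_offset != self.next_tx_offset {
            return Err(AppendError::OutOfOrder {
                expected: self.next_tx_offset,
                actual: commit.min_tx_offset,
            });
        }
        for record in &commit.records {
            self.out
                .write_all(record)
                .map_err(|e| AppendError::Io(e.kind()))?;
        }
        self.next_tx_offset += commit.records.len() as u64;
        Ok(())
    }

    /// Buffered bytes only reach the underlying writer once this is called.
    fn flush(&mut self) -> Result<(), AppendError> {
        self.out.flush().map_err(|e| AppendError::Io(e.kind()))
    }
}
```

With this shape, appending a commit at offset 0 with two records and then one
at offset 5 is rejected with `OutOfOrder { expected: 2, actual: 5 }`, while a
commit at offset 2 is accepted.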

A secondary, related motivation is the following:

A "commit" is an atomic unit of storage, meaning that a torn (partial)
write of a commit will render the entire commit corrupt. There hasn't
been a compelling case for batching multiple transactions into one
commit, and we have always configured the server to write exactly one
transaction per commit.
The code to handle buffering of transactions is, however, rather
complex, as it tries hard to allow the caller to retry writes at commit
boundaries. An unfortunate consequence of this is that we'd flush to the
OS very often, leaving throughput performance on the table.

So, if there is a compelling case for batching multiple transactions in
a commit, it should be the datastore's responsibility.


# API and ABI breaking changes

Breaks internal APIs only.

# Expected complexity level and risk

5 - Mostly for the risk

# Testing

Existing tests.
2026-02-13 13:10:30 +00:00
Phoebe Goldman e77b62f475 Also capture a snapshot every new commitlog segment (#3405)
# Description of Changes

We've run into a problem on Maincloud caused by a database that was
writing a relatively small number of very large transactions. This was
accruing many commitlog segments consuming hundreds of gigabytes of
disk, but had not ever taken a snapshot, or compressed or archived any
data, as the database had not progressed past one million transactions.

With this PR, we take a snapshot every time the commitlog segment
rotates. We still also snapshot every million transactions.

One BitCraft database we looked at had 2.5 million transactions per
commitlog segment, meaning that this change will not meaningfully affect
the frequency of snapshots. The offending Maincloud database, however,
had only 50 transactions per segment!
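
To illustrate how the two triggers combine, here is a rough sketch. It is not
the code from this PR: `should_snapshot` and `count_snapshots` are hypothetical
helpers, and keying the periodic trigger on the transaction offset modulo one
million is an assumption made for the example.

```rust
/// Periodic trigger from the description: every million transactions.
const SNAPSHOT_INTERVAL: u64 = 1_000_000;

/// Hypothetical policy: snapshot when the commitlog segment rotates,
/// or when the transaction offset reaches a multiple of the interval.
fn should_snapshot(tx_offset: u64, segment_rotated: bool) -> bool {
    segment_rotated || (tx_offset > 0 && tx_offset % SNAPSHOT_INTERVAL == 0)
}

/// Count snapshots over `total_txs` transactions, assuming (for the
/// example only) that a segment rotates every `txs_per_segment`.
fn count_snapshots(total_txs: u64, txs_per_segment: u64) -> u64 {
    (1..=total_txs)
        .filter(|&tx| should_snapshot(tx, tx % txs_per_segment == 0))
        .count() as u64
}
```

Under these assumptions, a database with 50 transactions per segment gets a
snapshot every 50 transactions instead of none before the first million, while
one with 2.5 million transactions per segment only gains an occasional extra
snapshot on top of the periodic ones.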

# API and ABI breaking changes

N/a

# Expected complexity level and risk

3: Hastily made changes to finicky code across several crates.

# Testing

I am unsure how to test these changes.

2025-10-15 15:18:15 +00:00
Noa a5212a5f75 Commitlog compression (#2504) 2025-03-31 22:00:52 +00:00
Kim Altintop 5063bd8759 commitlog: Streaming (#2492) 2025-03-26 07:40:23 +00:00