62 Commits

Author SHA1 Message Date
Phoebe Goldman e77b62f475 Also capture a snapshot every new commitlog segment (#3405)
# Description of Changes

We've run into a problem on Maincloud caused by a database that was
writing a relatively small number of very large transactions. This was
accruing many commitlog segments consuming hundreds of gigabytes of
disk, but had not ever taken a snapshot, or compressed or archived any
data, as the database had not progressed past one million transactions.

With this PR, we take a snapshot every time the commitlog segment
rotates. We still also snapshot every million transactions.

One BitCraft database we looked at had 2.5 million transactions per
commitlog segment, meaning that this change will not meaningfully affect
the frequency of snapshots. The offending Maincloud database, however,
had only 50 transactions per segment!
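
As a rough illustration of the combined trigger described above, here is a minimal sketch. The names `SNAPSHOT_INTERVAL_TXS` and `should_snapshot` are hypothetical and not taken from the codebase, and the sketch assumes the million-transaction rule fires on offsets that are exact multiples of the interval; the real implementation may count differently.

```rust
/// Assumed snapshot interval: one million transactions (per the description above).
const SNAPSHOT_INTERVAL_TXS: u64 = 1_000_000;

/// Hypothetical helper: decide whether to snapshot after committing `tx_offset`.
///
/// A snapshot is due if the commitlog just rotated to a new segment,
/// or if the (non-zero) offset is a multiple of the interval.
fn should_snapshot(tx_offset: u64, segment_rotated: bool) -> bool {
    let interval_hit = tx_offset != 0 && tx_offset % SNAPSHOT_INTERVAL_TXS == 0;
    segment_rotated || interval_hit
}
```

Under this sketch, a database with 50 very large transactions per segment snapshots on every rotation, while one with 2.5 million transactions per segment is still driven almost entirely by the interval rule.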

# API and ABI breaking changes

N/A

# Expected complexity level and risk

3: Hastily made changes to finicky code across several crates.

# Testing

I am unsure how to test these changes.

2025-10-15 15:18:15 +00:00
Noa 619b8ce021 Bump Rust to 1.90 (#3397)
# Description of Changes

Necessary for pulling in rolldown.

# API and ABI breaking changes

None

# Expected complexity level and risk

1, with the caveat that this updates the Rust version and therefore
touches all the code.

# Testing

- [ ] Just the automated testing
2025-10-09 20:41:25 +00:00
Zeke Foppa f6f0909ea4 Update all licenses (#3002)
# Description of Changes

We recently merged several repos together. This PR clarifies the license
terms for several subdirectories, as well as the relationship between
the licenses.

The licenses in our subdirectories have become symbolic links to
licenses in our toplevel `licenses` directory. For any particular
subdirectory's license file in the diff, you can click `... -> View
file` and then click on the text that says "Symbolic Link" on that page.
This will take you to the license file that it links to.

I have also updated the `tools/upgrade-version` script to update the
change date in the new `licenses/BSL.txt` file.

# API and ABI breaking changes

None.

# Expected complexity level and risk

1

# Testing

None. Only changes to license files.

---------

Co-authored-by: Zeke Foppa <bfops@users.noreply.github.com>
2025-08-12 18:20:58 +00:00
Kim Altintop 37c64c787b commitlog: Provide folding over a range of tx offsets (#3129)
Adds methods and free-standing functions to allow folds to stop at an upper
bound, by passing a range instead of only a start offset.
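
The idea of folding over a range rather than from a start offset can be sketched as follows. This is an illustration only: `Tx`, `fold_range`, and `past_upper_bound` are hypothetical names, and the in-memory slice stands in for the on-disk log; none of this is the crate's actual API.

```rust
use std::ops::{Bound, RangeBounds};

/// Hypothetical stand-in for a stored transaction.
struct Tx {
    offset: u64,
    payload: u64,
}

/// True if `offset` lies beyond the upper bound of `range`.
fn past_upper_bound<R: RangeBounds<u64>>(range: &R, offset: u64) -> bool {
    match range.end_bound() {
        Bound::Included(&end) => offset > end,
        Bound::Excluded(&end) => offset >= end,
        Bound::Unbounded => false,
    }
}

/// Fold `f` over the transactions whose offsets fall within `range`.
/// Assumes `txs` is sorted by offset, so traversal can stop at the upper bound.
fn fold_range<R, A, F>(txs: &[Tx], range: R, init: A, mut f: F) -> A
where
    R: RangeBounds<u64>,
    F: FnMut(A, &Tx) -> A,
{
    let mut acc = init;
    for tx in txs {
        if past_upper_bound(&range, tx.offset) {
            break; // nothing further can be in range
        }
        if range.contains(&tx.offset) {
            acc = f(acc, tx);
        }
    }
    acc
}
```

Accepting any `RangeBounds<u64>` means a call such as `fold_range(&txs, 8.., init, f)` behaves like the old start-offset-only form, while `3..6` or `3..=6` adds the upper bound.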

# Expected complexity level and risk

1

# Testing
2025-08-08 11:55:27 +00:00
Kim Altintop 7709f3cf1e commitlog: Set up options for toml configuration (#2942) 2025-07-17 08:34:35 +00:00
Noa 742303ca49 Bump rust-toolchain to rust 1.88 (#2749)
Co-authored-by: Mazdak Farrokhzad <twingoow@gmail.com>
2025-07-15 17:39:41 +00:00
Viktor Szépe f6da9e1f5f Fix typos (#2812)
Signed-off-by: Viktor Szépe <viktor@szepe.net>
2025-06-04 16:33:32 +00:00
Kim Altintop a6bc0e59fd commitlog: Reduce log noise when offset index cannot be used (#2791) 2025-06-02 07:27:12 +00:00
Kim Altintop c98219529d commitlog: Fix index truncation test. (#2792) 2025-05-30 15:13:06 +00:00
Shubham Mishra f2a9657a72 Commitlog: handle empty offset index lookup (#2771) 2025-05-22 12:59:02 +00:00
Kim Altintop 1f4207de86 commitlog: Include latest commit offset in segment metadata (#2733) 2025-05-19 06:54:58 +00:00
Kim Altintop 3d1a91c25c Handle snapshot restore more robustly (#2735)
Signed-off-by: Kim Altintop <kim@eagain.io>
Signed-off-by: Shubham Mishra <shivam828787@gmail.com>
Co-authored-by: Shubham Mishra <shubham@clockworklabs.io>
2025-05-15 14:35:09 +00:00
Shubham Mishra 41c316c984 Commitlog stream range fix. (#2721) 2025-05-10 04:06:05 +00:00
Noa 483a9488e2 Update rand (#2568) 2025-04-11 17:39:41 +00:00
Mario Montoya 3fd78203c4 Compress the snapshot (#2034) 2025-04-11 15:18:17 +00:00
Shubham Mishra 76a52ca747 Use Offset Index on Meta extract (#2549) 2025-04-09 19:05:52 +00:00
Noa 2f6660e919 Add integration test for commitlog compression (#2538) 2025-04-08 17:10:31 +00:00
Kim Altintop d88a266c20 commitlog: Derive serde for Commit (#2535) 2025-04-02 11:16:11 +00:00
Noa d436b1f9b7 Followup to #2504 (#2534) 2025-03-31 23:52:56 +00:00
Noa a5212a5f75 Commitlog compression (#2504) 2025-03-31 22:00:52 +00:00
Kim Altintop 8dfab1c09d commitlog: Open stream writer with metadata (#2530) 2025-03-31 17:25:39 +00:00
Kim Altintop 5063bd8759 commitlog: Streaming (#2492) 2025-03-26 07:40:23 +00:00
Kim Altintop 434c28063f commitlog: Fix open flags for read-only offset index (#2468) 2025-03-19 12:24:30 +00:00
Mario Montoya f9f38543c8 Add readmes to all implementation crates specifying that they do not offer stable interfaces (#2320) 2025-03-06 19:50:17 +00:00
Shubham Mishra 7cb509c2e2 handle offset index empty (#2344) 2025-03-05 11:04:10 +00:00
Kim Altintop 8054999927 commitlog: Use fdatasync (#2338) 2025-03-04 15:20:14 +00:00
Noa 293aebaef9 Bump to Rust 1.84 (#2001) 2025-01-28 23:11:29 +00:00
Phoebe Goldman d171b44a89 Don't create indexes during bootstrapping; wait until after replay (#2161) 2025-01-23 19:41:39 +00:00
Kim Altintop c5f4c8bc5c commitlog: Make offset index usable externally (#2108) 2025-01-14 18:56:08 +00:00
Kim Altintop da0f83b6dd commitlog: Make memory segment behave like O_APPEND (#2072) 2024-12-20 11:28:19 +00:00
Kim Altintop a191055f56 commitlog: Fix offset index truncation (#2073) 2024-12-19 15:44:10 +00:00
Kim Altintop 31698618a8 commitlog: Provide segment_len method for segments (#2042) 2024-12-10 10:43:39 +00:00
Shubham Mishra f04d2817d0 create commitlog dir in fs::New (#2006) 2024-11-21 15:47:40 +00:00
Kim Altintop 125ab58388 commitlog: Fix set_epoch (#2005) 2024-11-21 13:34:10 +00:00
Noa 97bff92efb Optimize integrate_generated_columns (#1895) 2024-11-12 16:36:50 +00:00
Noa f136670420 Directory structure impl (#1879)
Co-authored-by: Jeffrey Dallatezza <jeffreydallatezza@gmail.com>
2024-11-12 04:24:43 +00:00
Kim Altintop e4fcb72432 commitlog: Small tweaks (#1978) 2024-11-11 13:24:21 +00:00
Kim Altintop f22b163c0a commitlog: Introduce epoch (#1851) 2024-11-05 10:10:30 +00:00
Kim Altintop d09e1eabd2 commitlog: Improve skipping behavior of traversals (#1902) 2024-10-28 08:07:40 +00:00
Kim Altintop 98d408b388 commitlog: Fix transactions iterator (#1884) 2024-10-22 16:40:51 +00:00
Kim Altintop afeb3421ae commitlog: Yield StoredCommit in iterators (#1791) 2024-10-08 08:53:25 +00:00
Shubham Mishra c5a9a0d9ca offset index read integration (#1779) 2024-10-07 12:12:50 +00:00
Kim Altintop ae17c6d57e commitlog: Return commit info (#1778) 2024-10-01 17:31:42 +00:00
Shubham Mishra a7e23210f6 Downlevel some info messages (#1739) 2024-09-25 05:54:21 +00:00
Shubham Mishra eeaa00a05f Commitlog offset index (#1671)
Signed-off-by: Shubham Mishra <shubham@clockworklabs.io>
Co-authored-by: Kim Altintop <kim@eagain.io>
2024-09-24 16:06:49 +00:00
Kim Altintop 0029ca5648 commitlog: Make commit module public, and allow access to header fields (#1685) 2024-09-10 08:16:32 +00:00
Kim Altintop 8338b53b8f commitlog: Fix single-commit bitflip test (#1528) 2024-07-19 05:57:53 +00:00
Jeremie Pelletier f91dcda283 Make some commitlog helpers public (#1390) 2024-07-09 18:02:58 +00:00
Kim Altintop ff851ae5fa commitlog: Make bitflip test a proptest (#1333)
* commitlog: Make bitflip test a proptest

The test sometimes fails. As a proptest, we'll be able to seed it with
failing inputs.

Fixes: #1167

* commitlog: Fix the bitflip test

Turns out we sometimes flipped a bit in the CRC32 itself, which makes
the test fail in an unexpected way.
2024-06-05 05:53:41 +00:00
Phoebe Goldman 18aa1d4299 Fix commitlog fold_transactions_from ignoring requested offset (#1330)
* Fix commitlog `fold_transactions_from` ignoring requested offset

Prior to this commit, `fold_transactions_from` on a durability backed by a commitlog
would discard the requested offset and unconditionally yield all txes in the relevant segments.

This commit changes that behavior so that `fold_transactions_from`
skips commitlog commits (which contain many txes) less than the requested offset,
and skips txes using `consume_record`.

* Add `Decoder::skip_record`

Lucky I asked Kim whether I was using `consume_record` and `decode_record` correctly,
because I wasn't.

This commit adds methods to `Decoder` and `Visitor` for skipping records and rows,
causing them to be extracted from the reader but not folded.

* Fix test

Add the new methods to the `Decoder` and `Visitor` implementations hidden away in a test I missed.
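
The two-level skipping described in the first item (whole commits below the requested offset, then individual txes inside the first relevant commit) might look roughly like this. The `Commit` layout and the `fold_from` function are hypothetical simplifications for illustration; the real code decodes records from a reader and uses `Decoder::skip_record` instead of indexing into a vector.

```rust
/// Hypothetical simplified commit: a run of consecutive txes
/// starting at `min_tx_offset`.
struct Commit {
    min_tx_offset: u64,
    txs: Vec<&'static str>,
}

/// Collect all txes with offset >= `start`, paired with their offsets.
fn fold_from(commits: &[Commit], start: u64) -> Vec<(u64, &'static str)> {
    let mut out = Vec::new();
    for commit in commits {
        let end = commit.min_tx_offset + commit.txs.len() as u64;
        if end <= start {
            continue; // every tx in this commit is below the requested offset
        }
        for (i, tx) in commit.txs.iter().enumerate() {
            let offset = commit.min_tx_offset + i as u64;
            if offset < start {
                continue; // skip the record: consumed, but not folded
            }
            out.push((offset, *tx));
        }
    }
    out
}
```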
2024-06-03 22:37:43 +00:00