3186 Commits

Author SHA1 Message Date
Mazdak Farrokhzad b6c0e1c4d8 Add AlgebraicValue::take + move test-code in btree_index to tests (#1028)
* add AlgebraicValue::take for a neater interface

* btree_index: move test-only code to tests
2024-03-27 17:05:19 +00:00
joshua-spacetime 99bd7ac591 perf(1024): Remove serialization from tx execution thread (#1027)
Closes #1024.

Before this change,
we would serialize messages **before** inserting into the send queue.

Because we commit the tx only after inserting into the send queue,
this meant we were holding onto the database lock unnecessarily.

After this change,
we serialize messages **after** inserting into the send queue.
This means we serialize only after committing the tx.
2024-03-27 16:06:10 +00:00
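The ordering change described in #1027 can be illustrated with a minimal sketch. Everything here is hypothetical and is not the project's code: `Update`, `serialize`, `commit_and_enqueue`, and `drain_and_serialize` are made-up names, a `Mutex<Vec<u64>>` stands in for the database lock, and a standard-library channel stands in for the send queue. The point is only that the raw update is enqueued while the lock is held, and the encoding work happens later on the consumer side.

```rust
use std::sync::mpsc::{channel, Receiver, Sender};
use std::sync::Mutex;

/// Hypothetical update produced by a transaction (not a real project type).
#[derive(Debug, Clone, PartialEq)]
pub struct Update {
    pub table: String,
    pub row: u64,
}

/// Stand-in for the real wire encoding; assumed to be the expensive step.
pub fn serialize(update: &Update) -> Vec<u8> {
    format!("{}:{}", update.table, update.row).into_bytes()
}

/// Transaction side: take the lock, apply the write, enqueue the *unserialized*
/// update, then release the lock. No encoding happens while the lock is held.
pub fn commit_and_enqueue(db: &Mutex<Vec<u64>>, queue: &Sender<Update>, row: u64) {
    let mut rows = db.lock().unwrap();
    rows.push(row);
    queue
        .send(Update { table: "t".to_string(), row })
        .unwrap();
    // `rows` is dropped here, which releases the lock.
}

/// Sender side: serialization happens only after the update left the tx thread.
pub fn drain_and_serialize(queue: &Receiver<Update>) -> Vec<Vec<u8>> {
    queue.try_iter().map(|update| serialize(&update)).collect()
}
```

With this shape, the time spent in `serialize` no longer counts against the time the lock is held, which is the effect the commit message describes.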
Mazdak Farrokhzad 9141a42622 Bump Rust to 1.77 + fix warnings + use Bound::map (#1020)
* bump Rust to 1.77 + fix warnings + use Bound::map

* use .truncate(true) for OpenOptions
2024-03-25 20:27:08 +00:00
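For readers unfamiliar with the two standard-library calls named in #1020, here is a small sketch. It assumes the commit refers to `std::ops::Bound::map` (stable since Rust 1.77) and `std::fs::OpenOptions::truncate`; the helper names `widen` and `overwrite` are invented for illustration and do not come from the repository.

```rust
use std::fs::OpenOptions;
use std::io::{self, Write};
use std::ops::Bound;
use std::path::Path;

/// Converts a bound over `u32` into a bound over `u64`
/// without matching on each variant by hand.
pub fn widen(bound: Bound<u32>) -> Bound<u64> {
    bound.map(u64::from)
}

/// Replaces the contents of `path` with `contents`.
/// `truncate(true)` states explicitly that any old bytes are discarded;
/// without it, a shorter write would leave the tail of the old file in place.
pub fn overwrite(path: &Path, contents: &[u8]) -> io::Result<()> {
    let mut file = OpenOptions::new()
        .write(true)
        .create(true)
        .truncate(true)
        .open(path)?;
    file.write_all(contents)
}
```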
Phoebe Goldman ba8a8d93c3 BFLATN -> BSATN fast-path for fixed-length rows (#1005)
* Implement (but do not use) a fast path for BFLATN -> BSATN conversion

* fmt and clippy

* `u16` offset rather than `usize`

* Address Joshua's review

* Define methods on `RowRef` and `RelValue` which use the new serializer

* Comment in `align_to` about div-by-zero

Co-authored-by: Mazdak Farrokhzad <twingoow@gmail.com>
Signed-off-by: Phoebe Goldman <phoebe@goldman-tribe.org>

* Add benchmark comparing BFLATN -> BSATN with and without the fast path

* Add benchmark on `u64_u64_u32`, which has less interior padding than `u32_u64_u64`

* Remove `to_len` from `to_bsatn_extend`

It turns out to be slower than just eating the `realloc`s.

* Remove unused `to_bsatn_slice`

I thought I would need it, but it ended up not being useful.

* Expand comment with example; `Box<[...]>` to reduce memory footprint

* Comments from Mazdak's review

---------

Signed-off-by: Phoebe Goldman <phoebe@goldman-tribe.org>
Co-authored-by: Mazdak Farrokhzad <twingoow@gmail.com>
2024-03-25 19:46:10 +00:00
Tyler Cloutier af73b29b5f Made me the codeowner of the traits.rs file for the datastore 2024-03-21 21:19:31 -07:00
joshua-spacetime 47e787877f test(1099): Multi-column index selection through query macro (#1001) 2024-03-21 23:33:13 +00:00
joshua-spacetime fb932b603d fix(1009): Multi-column index selection for query macro (#1012)
Fixes #1009.

Looking up a positional FieldName in a Header was broken.
2024-03-21 22:53:26 +00:00
Mario Montoya ffc3caedeb Show the error text of the server when a sql call fails on cli (#1004) 2024-03-21 20:23:51 +00:00
joshua-spacetime b0b7b58982 fix: Consistency test for subscription message ordering (#1007)
Updates a test to wake up a writer tx only after a reader tx has started.
2024-03-21 19:15:51 +00:00
joshua-spacetime b36dda419e fix(996): Do not release database lock early for subscriptions (#997)
Fixes #996.
2024-03-20 17:21:39 +00:00
joshua-spacetime 8d83b00e27 test(996): Subscriptions should not drop read lock early (#995)
If a subscription drops its read lock on the database too early,
that is before it sends its updates to the client,
this test will fail.
2024-03-20 16:40:02 +00:00
Mazdak Farrokhzad 5cbdcaecbb incr-join, find_updates: avoid unnecessary clones & use partition (#988)
* incr-join, find_updates: avoid unnecessary clones & use partition

* JoinSide: store 'Vec<PV>'s instead

* address joshua & phoebe's reviews
2024-03-19 17:21:12 +00:00
Phoebe Goldman a21b1bc3a9 Nuke to_mem_table_with_op_type (#990)
* Nuke `to_mem_table_with_op_type`

Rather than annotating rows with `__op_type` during `eval_incr` of selects,
partition the rows before evaluation, then merge after.

* Add historical comment.

Co-authored-by: Mazdak Farrokhzad <twingoow@gmail.com>
Signed-off-by: Phoebe Goldman <phoebe@goldman-tribe.org>

* Remove `_replaced_source_id`

---------

Signed-off-by: Phoebe Goldman <phoebe@goldman-tribe.org>
Co-authored-by: Mazdak Farrokhzad <twingoow@gmail.com>
2024-03-18 21:09:35 +00:00
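The "partition, evaluate, merge" approach from #990 might look roughly like the sketch below. It is an illustration under simplifying assumptions, not the project's `eval_incr`: rows are plain `u32` values, the selection is an arbitrary predicate, and `Op`, `select`, and `eval_incr_select` are hypothetical names.

```rust
/// Hypothetical tag for whether a row was inserted or deleted by a transaction.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Op {
    Insert,
    Delete,
}

/// A selection that knows nothing about op types: it only filters values.
fn select(values: Vec<u32>, pred: &impl Fn(u32) -> bool) -> Vec<u32> {
    values.into_iter().filter(|v| pred(*v)).collect()
}

/// Splits the rows by op type first, runs the same selection on each half,
/// then merges the results, re-attaching the op type only at the end.
pub fn eval_incr_select(rows: Vec<(Op, u32)>, pred: impl Fn(u32) -> bool) -> Vec<(Op, u32)> {
    let (inserts, deletes): (Vec<(Op, u32)>, Vec<(Op, u32)>) =
        rows.into_iter().partition(|(op, _)| *op == Op::Insert);
    let strip = |part: Vec<(Op, u32)>| -> Vec<u32> { part.into_iter().map(|(_, v)| v).collect() };

    let mut out = Vec::new();
    out.extend(select(strip(inserts), &pred).into_iter().map(|v| (Op::Insert, v)));
    out.extend(select(strip(deletes), &pred).into_iter().map(|v| (Op::Delete, v)));
    out
}
```

Because the selection itself never sees an op type, no extra per-row annotation column is needed during evaluation.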
Mario Montoya e270d97012 Remove useless usages of RowCount (#987) 2024-03-18 19:45:49 +00:00
Phoebe Goldman 02aeac7fdc Don't do alignment computations during BFLATN ser/de (#986)
`AlgebraicTypeLayout` and friends already include full layout information,
including properly-aligned offsets for `ProductTypeElementLayout`s.
As such, there's no need to do any alignment computation
during `serialize_value` or `write_value`.

Instead, while traversing a `ProductTypeLayout`,
we can use each element's `offset` to update the `curr_offset`.
2024-03-18 14:25:59 +00:00
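The idea in #986 can be sketched as follows. This is a deliberately reduced model, not the actual `AlgebraicTypeLayout` machinery: `FieldLayout` and `pack_fields` are hypothetical, and only fixed-size fields are considered. Each field's offset is assumed to have been aligned once, when the layout was built, so the traversal just reads it.

```rust
/// Hypothetical, simplified layout of one fixed-size field within a row.
pub struct FieldLayout {
    /// Byte offset in the padded row; already aligned when the layout was built.
    pub offset: usize,
    /// Size of the field in bytes.
    pub size: usize,
}

/// Copies each field out of a padded row into a packed buffer.
/// No alignment arithmetic happens here: the stored `offset` is used directly.
pub fn pack_fields(row: &[u8], fields: &[FieldLayout]) -> Vec<u8> {
    let mut out = Vec::new();
    for field in fields {
        out.extend_from_slice(&row[field.offset..field.offset + field.size]);
    }
    out
}
```

For a row holding a `u32` followed by a `u64`, the layout would place the second field at offset 8 rather than 4, and `pack_fields` would skip the four padding bytes simply by following that offset.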
Mazdak Farrokhzad 3f425a9483 fix: Optimize query plan in iter_filtered_chunks (#939)
Co-authored-by: joshua-spacetime <josh@clockworklabs.io>
2024-03-16 07:21:49 +00:00
Mazdak Farrokhzad da1baaa5fd perf(933): Clone bsatn instead of product values in incremental update (#951)
* eval_updates: use map entry apis

* dedup logic in remove_subscription + use entry api to hash only once

* stop cloning PVs in eval_updates

* address Phoebe's comments

* add tracing for perf testing

---------

Co-authored-by: joshua-spacetime <josh@clockworklabs.io>
2024-03-16 02:08:02 +00:00
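"Use entry api to hash only once" in #951 refers to a common pattern with Rust's standard `HashMap`. The sketch below shows the general technique on a made-up counting task; `count_rows` is a hypothetical name and is unrelated to the project's `eval_updates` or `remove_subscription`.

```rust
use std::collections::HashMap;

/// Counts how often each row appears.
/// `entry` looks the key up once and yields a handle to the slot, so there is
/// no separate `contains_key` / `get_mut` / `insert` sequence that would hash
/// the same key again.
pub fn count_rows(rows: &[&str]) -> HashMap<String, usize> {
    let mut counts = HashMap::new();
    for row in rows {
        *counts.entry(row.to_string()).or_insert(0) += 1;
    }
    counts
}
```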
Mazdak Farrokhzad 755457a111 perf(813): Avoid materialization of product values in subscriptions (#959)
Closes #813.

For queries with read-only row operations,
a subscription will no longer materialize product values;
instead it will serialize from bflatn straight to bsatn.

Co-authored-by: joshua-spacetime <josh@clockworklabs.io>
2024-03-15 22:38:47 +00:00
Mazdak Farrokhzad 5601c18c52 perf: don't clone QueryExpr (#981)
Co-authored-by: joshua-spacetime <josh@clockworklabs.io>
2024-03-15 19:55:35 +00:00
Phoebe Goldman 96e5ef16f0 Distinguish between inner and semijoins in QueryExpr AST. (#969)
* Distinguish between inner and semijoins in `QueryExpr` AST.

This commit adds a flag `semi: bool` to `JoinExpr`, which signifies a semijoin,
as opposed to an inner join.

A new optimization pass, `QueryExpr::try_semi_join`, is defined
which can detect a certain common case of inner joins and rewrite them into semijoins.

The punchline here is that `core::vm::join_inner` used to accept a flag `semi: bool`
which it could use to avoid some expensive `Header` mutations,
but that flag was always passed as `false` because we had no way to distinguish semijoins.
With this commit, the flag is actually used,
so evaluating non-indexed semijoins should avoid allocating a new `Header`.

* Address Joshua's review

- Remove a test that was silly and backwards, and intentionally thwarted the optimizer
  in a way that will hopefully stop working soon.
- Add a test that an `IncrementalJoin`'s `virtual_plan` looks like we expect.
- Rename the `JoinExpr` argument to `core::vm::join_inner` for clarity.
- Sprinkle comments around about how we compile and optimize joins.
2024-03-15 14:09:52 +00:00
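To make the distinction in #969 concrete, here is a sketch of the textbook difference between the two joins, using nested loops over tiny in-memory tables. It does not reproduce `JoinExpr` or `core::vm::join_inner`; `inner_join` and `semi_join` are hypothetical functions, and whether the real semijoin emits a left row once per match or once overall is not stated in the commit message (this sketch emits it once).

```rust
/// Inner join on the key: each output row combines columns from both sides,
/// so the result has a new, wider shape.
pub fn inner_join<'a>(
    lhs: &[(u32, &'a str)],
    rhs: &[(u32, &'a str)],
) -> Vec<(u32, &'a str, &'a str)> {
    let mut out = Vec::new();
    for &(lk, lv) in lhs {
        for &(rk, rv) in rhs {
            if lk == rk {
                out.push((lk, lv, rv));
            }
        }
    }
    out
}

/// Semijoin on the key: keeps left rows that have at least one match on the
/// right. The output has exactly the left side's shape, so no new row
/// description has to be built.
pub fn semi_join<'a>(lhs: &[(u32, &'a str)], rhs: &[(u32, &'a str)]) -> Vec<(u32, &'a str)> {
    lhs.iter()
        .copied()
        .filter(|&(lk, _)| rhs.iter().any(|&(rk, _)| rk == lk))
        .collect()
}
```

The unchanged output shape is what lets a semijoin reuse the left side's header instead of allocating a combined one.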
Phoebe Goldman 617b2a8ab3 Log a warning when doing iter_by_col_range without an index (#971)
* Log a warning when doing `iter_by_col_range` without an index

* Only warn if the table is sufficiently large for a scan to be bad

Per Tyler's review, this commit gates the warning behind `rdb_num_table_rows`,
so that the warning is only printed
if the table in question has at least `TOO_MANY_ROWS_FOR_SCAN` rows.

`TOO_MANY_ROWS_FOR_SCAN` is defined as 1000
because that's the number Tyler said in his comment.

* Gate the unindexed warning behind a feature in `core`
2024-03-15 13:50:30 +00:00
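The gating rule described in #971 reduces to a single condition, sketched below. The constant name and its value of 1000 come from the commit message; the function `should_warn_unindexed_scan` and its parameters are hypothetical, and the additional feature gate mentioned in the last bullet is left out.

```rust
/// Threshold named in the commit message.
pub const TOO_MANY_ROWS_FOR_SCAN: u64 = 1000;

/// Returns true when a range scan without an index is worth a warning:
/// there is no index, and the table has at least `TOO_MANY_ROWS_FOR_SCAN` rows.
pub fn should_warn_unindexed_scan(has_index: bool, num_rows: u64) -> bool {
    !has_index && num_rows >= TOO_MANY_ROWS_FOR_SCAN
}
```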
Kim Altintop e9db89e47f core: Fix schema checks in database updates, again (#974)
It turns out that the changes introduced in #734 do not result in more
reliable detection of incompatible schema updates. This is because the
datastructures involved can be converted into each other, but that
conversion is not bijective.

Fix this by manually adjusting the schema of the existing table to be
comparable to the proposed table.

Also log details about a schema mismatch to the user-retrievable database log,
in unified diff format.
2024-03-15 08:55:40 +00:00
Mazdak Farrokhzad 261373c1d8 Remove #[tracing::instrument...] from InstanceEnv::*_by_col_eq (#983)
* remove tracing from InstanceEnv::*_by_col_eq

* Added #[tracing::instrument(skip_all)] to call_reducer_with_tx and call_reducer

* remove tracing on some stuff in wasm_instance_env

---------

Co-authored-by: Tyler Cloutier <cloutiertyler@aol.com>
2024-03-15 01:21:14 +00:00
Mazdak Farrokhzad 9313283517 fix BufReader for Cursor impl (#982) 2024-03-15 01:06:19 +00:00
Zeke Foppa 1b391b4edc Misc tweaks to tools/perf.sh (#937)
* [perf-tweaks]: perf tweaks

* [bfops/perf-tweaks]: more tweaks

* [bfops/perf-tweaks]: more tweaks

* [bfops/perf-tweaks]: tweaks

* [bfops/perf-tweaks]:

* [bfops/perf-tweaks]: review

* [bfops/perf-tweaks]: review
2024-03-14 23:24:54 +00:00
Zeke Foppa c1dfd8fddc Add a Testing section to the PR template (#898)
* [bfops/testing-in-template]: add testing to PR template

* Add Phoebe's suggestion

Co-authored-by: Phoebe Goldman <phoebe@clockworklabs.io>
Signed-off-by: Zeke Foppa <196249+bfops@users.noreply.github.com>

---------

Signed-off-by: Zeke Foppa <196249+bfops@users.noreply.github.com>
Co-authored-by: Phoebe Goldman <phoebe@clockworklabs.io>
2024-03-14 22:43:28 +00:00
Mazdak Farrokhzad 96b4f099b0 build_query, IndexScan, mem table: fix bug, avoid ColumnOp (#980) 2024-03-14 21:21:03 +00:00
Zeke Foppa 9c82e02ac8 spacetime publish: Add --wasm-file flag (#883)
* [bfops/publish-wasm]: commit

* [bfops/publish-wasm]: tweaks

* [bfops/publish-wasm]: rename

* [bfops/publish-wasm]: uncommitted changes

* [bfops/publish-wasm]: updates

* [bfops/publish-wasm]: fix

* [bfops/publish-wasm]: review

* [bfops/publish-wasm]: review
2024-03-14 21:08:53 +00:00
Mazdak Farrokhzad a552697550 simplify RelOps<'a> for CatalogCursor<I> (#979) 2024-03-14 20:17:54 +00:00
Mazdak Farrokhzad c2afe7d83a iter_filtered_chunks: avoid PVs (#978) 2024-03-14 20:09:01 +00:00
Mazdak Farrokhzad a05aed4e9a Code motion: QueryCode => QueryExpr; CrudCode => CrudExpr (#975)
* QueryCode.{table -> source}

* nix QueryCode; identical to QueryExpr

* nix CrudCode; use identical CrudExpr instead
2024-03-14 18:16:04 +00:00
Noa 540c519002 Rewrite smoketests as python unittests (#692)
* Rewrite smoketests as python unittests

* Get all tests working and do some work on parallel unittest

* Give up on parallel unittests

* Fix CI + address comments

* Fix skip-clippy arg confusion (just use the env var)

* fix ci

* Add comments
2024-03-14 02:47:38 +00:00
Mario Montoya f1226a056c Fix bench for location, restore single-column indexes (#967)
* Fix bench for location, restore single-column indexes

* Typo
2024-03-13 16:39:00 +00:00
joshua-spacetime 1066964f17 perf(816): Compile inner joins ahead of time for incremental evaluation (#964)
Joins of two delta tables are compiled to an inner join.
Their ahead of time compilation was not handled as part of #938.
2024-03-13 15:00:23 +00:00
joshua-spacetime af4a1a6425 test(954): Incremental evaluation for index joins (#958)
Adds a test for a tx that generates inserts and deletes for both tables of the join.
2024-03-12 23:14:16 +00:00
joshua-spacetime a563eb22d3 fix: Incremental evaluation for index joins (#957)
Closes #954.

I previously avoided the evaluation of certain delta table joins as an optimization,
which relied on the fact that a tx would not include inserts and deletes for both tables,
which of course is not generally correct.

This patch includes the fully general solution.
2024-03-12 22:14:35 +00:00
Phoebe Goldman 8994418136 Increase sdk tests' timeout to 60 seconds (#956)
We're seeing intermittent failures of the SDK tests in CI due to timeouts.
The timeout is meant only to fail if an expected event never happens;
the SDK tests are not interested in measuring performance at all.

This commit doubles the timeout from 30 to 60 seconds,
in the hopes that we will see fewer false failures.
2024-03-12 15:06:36 +00:00
HSReina f02c3bd2ad Fix issue where some Typescript based project transpiler might mangle the name (#35)
Co-authored-by: Gérald Divoux <gerald.divoux@ninsight.io>
2024-03-12 16:02:51 +01:00
Piotr Sarnacki 5a39592bdc Bump to 0.8.2 2024-03-12 15:35:31 +01:00
Nathaniel Richards bc5f4c93c8 Cleaned up code and added better error handling and responses for invalid tables (#32)
* Cleaned up code and added better error handling and responses for invalid tables

* Fixed breaking test

* Implemented new logging system

* made a small change to a WS error log
2024-03-12 15:23:58 +01:00
joshua-spacetime ba5ad054af refactor: Incremental evaluation tests for index joins (#952)
No functionality has been modified.
This patch just makes incremental eval test cases clearer and easier to extend.
2024-03-12 14:16:06 +00:00
Noa 987d742baa Also poll handle_queue while flushing (#947)
* Also poll handle_queue while flushing

* Apply suggestions from code review

Co-authored-by: Phoebe Goldman <phoebe@clockworklabs.io>
Signed-off-by: Noa <coolreader18@gmail.com>

---------

Signed-off-by: Noa <coolreader18@gmail.com>
Co-authored-by: Phoebe Goldman <phoebe@clockworklabs.io>
2024-03-08 21:09:30 +00:00
Mario Montoya 891f6b8931 Truly remove perfcnt (#946) 2024-03-08 20:26:30 +00:00
joshua-spacetime 6098bab296 perf: Update magic constant for join rewrite (#944) 2024-03-08 02:03:17 +00:00
joshua-spacetime 6a2af38c55 chore: Remove instrumentation from subscription eval hot path (#945) 2024-03-08 01:33:39 +00:00
joshua-spacetime 0af87d2306 perf: Make op_type special case fast for selections (#943)
As an optimization,
because we do not support strict projections in subscription queries,
selections may always assume that the op_type field is last,
when removing it.
2024-03-07 21:50:24 +00:00
james gilles 1611d10713 Remove perfcnt for now (#941) 2024-03-07 21:16:58 +00:00
joshua-spacetime bba5892f26 chore: Header does not need to be Ord (#942) 2024-03-07 20:58:27 +00:00
Phoebe Goldman 8d5b33e35d Incremental joins: compile once, run repeatedly. (#938)
* Incremental joins: compile once, run repeatedly.

Well, more like, compile 3 times, run repeatedly, but 3 is approximately 1.

This commit re-writes `IncrementalJoin` to be a re-usable representation
of a query plan for an incremental join,
where before it was a one-off worker for the same.

`IncrementalJoin` stores three copies of the query, compiled for `MemTable * DbTable`,
`DbTable * MemTable` and `MemTable * MemTable`.

Related to this change, `eval_incr` no longer needs an `AuthCtx`,
because we check permissions during query compilation, not execution,
and all query planning is now done ahead of time during `add_subscriber`.
As a result, many callsites, especially in tests, which used to pass an `AuthCtx`
no longer do so.

* `IncrementalJoin`: save `MemTable` headers to avoid recomputing

* Don't include `__op_type` column in incremental joins

Incremental joins never used the `__op_type` column,
as they separate deletes from inserts and eval them as separate queries.

This commit causes incremental joins to no longer include the `__op_type` column
in their `MemTable`s at all,
which simplifies the code and should remove some allocations.

* Remove `pub` on test `run_query` helper.

The module's not `pub`, so it was a pointless qualifier.
2024-03-07 17:47:49 +00:00
joshua-spacetime 30c7338682 chore: Remove debug instrumentation from hot path (#936)
Removes most instrumentation from RelationalDB,
but keeps all pre-existing instrumentation of the wasm api.
2024-03-07 01:23:01 +00:00