Prior to this commit, `to_mem_table_with_op_type` would call `Header::find_pos_by_name`
to locate the `__op_type` column, if it existed.
This was slow, as `Header::find_pos_by_name` is a linear scan in increasing order,
and the `__op_type` column is always either last or not present,
so `to_mem_table_with_op_type` would traverse every column in the `Header`.
This happened during every incremental evaluation for every query.
With this commit, we rely on the fact that `__op_type` is always the last column if present,
and check only the last column of the header.
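A minimal sketch of the difference, using a hypothetical simplified `Header` (the real type stores typed columns, not just names):

```rust
const OP_TYPE_COL: &str = "__op_type";

struct Header {
    column_names: Vec<String>,
}

impl Header {
    // Old approach: O(n) linear scan over every column.
    fn find_pos_by_name(&self, name: &str) -> Option<usize> {
        self.column_names.iter().position(|c| c == name)
    }

    // New approach: `__op_type`, if present, is always last,
    // so an O(1) check of the final column suffices.
    fn op_type_pos(&self) -> Option<usize> {
        let last = self.column_names.len().checked_sub(1)?;
        (self.column_names[last] == OP_TYPE_COL).then_some(last)
    }
}

fn main() {
    let h = Header {
        column_names: vec!["id".into(), "name".into(), OP_TYPE_COL.into()],
    };
    // Both approaches agree; the new one never walks the whole header.
    assert_eq!(h.find_pos_by_name(OP_TYPE_COL), h.op_type_pos());
}
```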
* Single-table subscription queries: plan once, run repeatedly
Prior to this commit, every call to `ExecutionUnit::eval_incr`
re-invoked the query planner to convert its `eval` plan
into a plan suitable for reading from a `MemTable` of updates.
With this commit, specifically for single-table select queries,
we invoke the query planner once during `ExecutionUnit::new`,
and store the resulting `eval_incr_plan` for repeated use.
A follow-up will do the same for multi-table semijoins.
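The shape of the change, as a hypothetical sketch (strings stand in for the real plan types):

```rust
struct ExecutionUnit {
    eval_plan: String,      // plan for evaluating against committed state
    eval_incr_plan: String, // precomputed plan over the MemTable of updates
}

impl ExecutionUnit {
    fn new(eval_plan: String) -> Self {
        // Planning happens exactly once, at construction time.
        let eval_incr_plan = Self::plan_for_updates(&eval_plan);
        ExecutionUnit { eval_plan, eval_incr_plan }
    }

    // Stand-in for the real planner invocation.
    fn plan_for_updates(plan: &str) -> String {
        format!("{plan} READING FROM updates")
    }

    // `eval_incr` now just executes the stored plan.
    fn eval_incr(&self) -> &str {
        &self.eval_incr_plan
    }
}

fn main() {
    let unit = ExecutionUnit::new("SELECT * FROM t".to_string());
    assert_eq!(unit.eval_incr(), "SELECT * FROM t READING FROM updates");
    let _ = &unit.eval_plan;
}
```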
* Docstrings for `SourceSet::len` and `::is_empty`
Co-authored-by: Mazdak Farrokhzad <twingoow@gmail.com>
Signed-off-by: Phoebe Goldman <phoebe@goldman-tribe.org>
* Compile select queries all the way to `QueryCode` ahead of time
---------
Signed-off-by: Phoebe Goldman <phoebe@goldman-tribe.org>
Co-authored-by: Mazdak Farrokhzad <twingoow@gmail.com>
Closes #747.
Before this change,
we would evaluate each and every query,
for each and every subscription,
on each and every row update.
If N subscriptions had a query Q in common,
it would be evaluated N different times.
With this change,
distinct queries are evaluated once,
and the results copied for each client.
So in the example above, Q would be evaluated once,
with the results transmitted to N different clients.
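The idea, as a hypothetical sketch (function and types are illustrative, not the real ones):

```rust
use std::collections::HashMap;

// Invert subscriptions into query -> clients, evaluate each distinct
// query once, then copy the results for every subscribed client.
fn eval_once_per_query(
    subscriptions: &[(String, u32)], // (query, client_id) pairs
    eval: impl Fn(&str) -> Vec<String>,
) -> HashMap<u32, Vec<String>> {
    let mut by_query: HashMap<&str, Vec<u32>> = HashMap::new();
    for (query, client) in subscriptions {
        by_query.entry(query.as_str()).or_default().push(*client);
    }
    let mut results: HashMap<u32, Vec<String>> = HashMap::new();
    for (query, clients) in by_query {
        let rows = eval(query); // evaluated once per distinct query
        for client in clients {
            results.insert(client, rows.clone()); // copied per client
        }
    }
    results
}

fn main() {
    // Two clients share query "Q"; it is evaluated only once.
    let subs = vec![("Q".to_string(), 1), ("Q".to_string(), 2)];
    let out = eval_once_per_query(&subs, |_| vec!["row".to_string()]);
    assert_eq!(out.len(), 2);
}
```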
* First step towards removing `Table` from query plans
Rework `SourceExpr` to be a logical placeholder for a table,
rather than a table itself.
Make query eval functions take an additional argument, a set of tables.
When evaluating a `SourceExpr` to a table, they will treat the `SourceExpr`
as a reference into the set of sources, and use the referred table.
This commit modifies only the VM crate; modifications to `core` are forthcoming.
It's possible that this commit's scheme for referring to `SourceExpr`s
will need to change,
as currently it forbids duplicate `SourceExpr`s,
which I think might occur during index joins.
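A hypothetical sketch of the indirection (simplified stand-ins for the real types): a `SourceExpr` names a slot in a `SourceSet` instead of owning a table.

```rust
#[derive(Clone, Copy)]
struct SourceId(usize);

struct SourceExpr {
    source_id: SourceId,
}

struct Table {
    rows: Vec<i64>, // stand-in for real row data
}

struct SourceSet {
    tables: Vec<Option<Table>>,
}

impl SourceSet {
    // Resolving a `SourceExpr` takes the table out of its slot, which is
    // why duplicate `SourceExpr`s for the same slot are forbidden: the
    // second resolution would find the slot empty.
    fn take(&mut self, expr: &SourceExpr) -> Option<Table> {
        self.tables.get_mut(expr.source_id.0)?.take()
    }
}

fn main() {
    let mut sources = SourceSet { tables: vec![Some(Table { rows: vec![1, 2] })] };
    let expr = SourceExpr { source_id: SourceId(0) };
    assert!(sources.take(&expr).is_some());
    assert!(sources.take(&expr).is_none()); // slot already consumed
}
```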
* `SourceBuilder` interface for assigning `SourceId`s.
* Integrate new query plan repr into `core`
* Put `DbTable` back in the AST; only `MemTable` is separate
Per review from Joshua, this commit makes `SourceExpr` into an enum
similar to the previous definition, with a `DbTable(DbTable)` variant.
Indirection to a `SourceSet` is imposed only for the `MemTable` variant.
This should make the PR's overall diff much simpler
(assuming I haven't inadvertently made any changes
in the process of reverting the `DbTable` code paths).
Related to the above, this PR simplifies `SourceSet`.
`SourceSet` now holds a `Vec<Option<MemTable>>`, where previously
it was a transparent newtype around `[Option<Table>]`.
This change eliminates the need for unsafe unsized conversions,
removes `SourceBuilder`,
and causes `SourceSet` to be uniformly consumed by the high-level query eval operators,
where previously `SourceSet`s had to be semi-reusable
because they could contain `DbTable`s.
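The revised shape, as a hypothetical sketch: `DbTable` stays inline in the AST, and only `MemTable`s go through the `SourceSet` indirection.

```rust
struct DbTable {
    table_id: u32, // stand-in for the real type
}

struct MemTable {
    rows: Vec<i64>, // stand-in for the real type
}

enum SourceExpr {
    DbTable(DbTable),
    MemTable { source_id: usize }, // index into a SourceSet
}

struct SourceSet(Vec<Option<MemTable>>);

impl SourceSet {
    // MemTables are consumed by evaluation, so a SourceSet is used
    // exactly once rather than being semi-reusable.
    fn take_mem_table(&mut self, source_id: usize) -> Option<MemTable> {
        self.0.get_mut(source_id)?.take()
    }
}

fn main() {
    let mut sources = SourceSet(vec![Some(MemTable { rows: vec![1] })]);
    let expr = SourceExpr::MemTable { source_id: 0 };
    if let SourceExpr::MemTable { source_id } = expr {
        assert!(sources.take_mem_table(source_id).is_some());
        assert!(sources.take_mem_table(source_id).is_none());
    }
    // DbTables need no SourceSet slot at all.
    let _db = SourceExpr::DbTable(DbTable { table_id: 7 });
}
```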
* Per Mazdak's review, docs!
* Add an index selector that takes multi-column indexes into account (and improve the `query!` macro)
* move select_best_index to vm/src/expr.rs; get rid of OpCmpIdx
* refactor test best_index
* simplify best_index* tests more
* create_table_multi_index: use ColListBuilder
* move & simplify create_table_multi_index
* simplify assert_index_scan + uses
* remove create_table, twas dead code
* ColumnOpFlat: use SmallVec instead
* simplify ScanIndex
* simplify best_index_range
* Add test for sql + joins + multi-index and fix invalid ambiguous field error
* slightly refactor select_best_index
* remove nonempty dependency
* Add test that actually runs the multi-column sql
* Add benchmark for multi-column vs many indexes
* simplify create_table_for_test*
* Add comments
* impl new algo for select_best_index + clone less
* improve select_best_index docs
* ScanIndex -> ScanOrIndex
* simplify is_sargable + use smallvec more
* let make_index handle a single ScanOrIndex
* make index stuff more private + remove dead code
* select_best_index: return IndexColumnOp directly; nix ScanOrIndex -- this removes an allocation
* do not reconstruct scan argument; avoid heap allocations
* borrow ColList in IndexArgument + avoid temp alloc in is_sargable
* optimize_select: remove Cow from fields_found
* is_sargable: reuse allocation from extract_fields
* rename is_sargable, avoid temp fields_found allocs, simplify optimize_select
* fix subscription benches
* drive-by refactor benches/subscription
* Keep a single benchmark for location
* Squashed commit of the following:
commit e54b09bab2
Author: Mario Montoya <mamcx@elmalabarista.com>
Date: Thu Feb 29 20:19:24 2024 -0500
Correctly show the error for AmbiguousField and simplify the code (#910)
commit 48a205a818
Author: Kim Altintop <kim@eagain.io>
Date: Thu Feb 29 19:32:21 2024 +0100
core: Fix host controller to not replace module if lifecycle hooks failed (#904)
* core: Fix host controller to not replace module if lifecycle hooks failed
Previously, `spawn_module_host` would unconditionally insert the new
module into the controller state, and not remove it if the lifecycle
hooks (`init_database` / `update_database`) returned an error.
This would mean that the module code was replaced with the new one, even
if it should be rejected because the schema was not updated or the init
/ update reducer failed.
Fix this by starting the module, and later "committing" it to the
controller state in two phases.
* Add commentary about database mutations / transactionality
---------
Signed-off-by: Mario Montoya <mamcx@elmalabarista.com>
Co-authored-by: Mazdak Farrokhzad <twingoow@gmail.com>
* core: Fix host controller to not replace module if lifecycle hooks failed
Previously, `spawn_module_host` would unconditionally insert the new
module into the controller state, and not remove it if the lifecycle
hooks (`init_database` / `update_database`) returned an error.
This would mean that the module code was replaced with the new one, even
if it should be rejected because the schema was not updated or the init
/ update reducer failed.
Fix this by starting the module, and later "committing" it to the
controller state in two phases.
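A hypothetical sketch of the two-phase fix (simplified stand-ins for the real host controller types): run the lifecycle hooks first, and only commit the module into controller state on success.

```rust
use std::collections::HashMap;

struct Module {
    name: String,
    code_version: u32,
}

struct HostController {
    modules: HashMap<String, Module>,
}

impl HostController {
    fn spawn_module_host(
        &mut self,
        module: Module,
        run_lifecycle_hooks: impl Fn(&Module) -> Result<(), String>,
    ) -> Result<(), String> {
        // Phase 1: start the module and run init/update hooks without
        // touching controller state.
        run_lifecycle_hooks(&module)?;
        // Phase 2: commit. On hook failure we never get here, so any
        // previously-registered module is left in place.
        self.modules.insert(module.name.clone(), module);
        Ok(())
    }
}

fn main() {
    let mut hc = HostController { modules: HashMap::new() };
    hc.modules.insert("db".into(), Module { name: "db".into(), code_version: 1 });
    let bad = Module { name: "db".into(), code_version: 2 };
    assert!(hc.spawn_module_host(bad, |_| Err("init failed".into())).is_err());
    assert_eq!(hc.modules["db"].code_version, 1); // old module retained
}
```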
* Add commentary about database mutations / transactionality
* perf(832): Remove redundant row deduplication in subscriptions
Closes #832.
The database already operates under set semantics,
so unless multiple queries return rows from the same table,
deduplication of the result set is not necessary.
* Rip out all deduplication
---------
Co-authored-by: Noa <coolreader18@gmail.com>
Re https://github.com/clockworklabs/SpacetimeDB/pull/840 .
We're removing the `row_pk` from the WebSocket API `TableRowOperation`,
as computing it has a major performance impact on the server.
This commit removes references to it from the WebSocket API reference.
Re: https://github.com/clockworklabs/SpacetimeDB/pull/840
This commit updates the TypeScript SDK to no longer use the `row_pk` field
in the client API,
as that field no longer exists.
As in the other SDKs, we replace our use of the `row_pk`
with the serialized representation of the row,
as this saves our needing to have objects/dicts/hash-maps keyed on domain types.
Unlike the other SDKs, we support either the binary (protobuf) or JSON APIs.
When using the binary API, we convert the BSATN row to a string,
and use that as the `rowPk`.
When using the JSON API, we `JSON.stringify` the row itself, and use that as the `rowPk`.
The latter is ugly and not performant,
but we don't care because the JSON API is slow anyways.
This commit also removes some uses of `any` from the deserialization code,
because I wanted the compiler to double-check my work.
Re: https://github.com/clockworklabs/SpacetimeDB/pull/840
This commit updates the C# SDK to no longer use the `row_pk` field
of the Protobuf client API,
as that field has been removed. (Will have been removed, as of merging.)
Where a table cache was previously keyed on `byte[] rowPk`,
it is now keyed on `byte[] rowBytes`,
where `rowBytes` is the BSATN-encoded bytes of the row.
This means we effectively store two copies of each row in the client cache:
the BSATN serialized format, and the decoded domain type.
An alternate implementation would be to make the table caches be sets of domain types,
discarding the BSATN bytes.
We find this undesirable for several reasons:
- Hashing and equality-comparing `byte[]` is almost certainly more efficient
than doing the same for domain types.
- Even if hashing and equality-comparing domain types were efficient,
we would still have to update codegen to emit hashing and equality methods
for all types in the module_bindings.
This implementation requires no changes to the module_bindings.
- We already have the BSATN bytes sitting around,
as they're necessarily part of the message we receive from the server.
This change does no additional serialization or deserialization.
In essence, we're trading memory for time and simplicity.
Keeping the BSATN bytes live approximately doubles the table cache's memory usage,
but simplifies the implementation greatly,
and (we suspect) speeds up table cache insertions, deletions and lookups.
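The scheme, sketched in Rust for brevity (`Row` and `TableCache` are illustrative stand-ins for the generated C# types):

```rust
use std::collections::HashMap;

// Key the cache on the serialized row bytes, keeping both the encoded
// and decoded forms of each row.
#[derive(Clone, Debug, PartialEq)]
struct Row {
    id: u32,
    name: String, // stand-in domain type; needs no Hash/Eq impl
}

struct TableCache {
    // Key: BSATN-encoded row bytes; value: decoded domain value.
    rows: HashMap<Vec<u8>, Row>,
}

impl TableCache {
    fn insert(&mut self, row_bytes: Vec<u8>, row: Row) {
        // Hashing a byte buffer is cheap and uniform across row types.
        self.rows.insert(row_bytes, row);
    }

    fn delete(&mut self, row_bytes: &[u8]) -> Option<Row> {
        self.rows.remove(row_bytes)
    }
}

fn main() {
    let mut cache = TableCache { rows: HashMap::new() };
    let bytes = vec![1, 0, 0, 0]; // pretend BSATN encoding of the row
    cache.insert(bytes.clone(), Row { id: 1, name: "a".into() });
    assert!(cache.delete(&bytes).is_some());
}
```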
* Remove `row_pk` from client API; hash rows not row_ids on client
This commit removes the `row_pk` field from `message TableRowOperation` in the client API.
This is the beginning of addressing issue #831.
This has several implications for clients:
- Because we no longer have a stable content-addressed ID for each row,
the client is responsible for hashing rows itself.
- This means that generated types must be `Hash + Eq`.
- This means that generated types must use `sats::F32` or `sats::F64`
rather than `f32` or `f64`,
as the former are `Eq + Hash` while the latter are not.
(Aside: it's stupid that Rust floats are not `Eq + Ord + Hash`,
since IEEE-754 defines a total equality and total ordering on floats.)
Exposing `sats::F32` to users is ugly, but I can't think of a better design.
This commit also makes some minor formatting changes to Rust codegen,
since it was emitting code that rustc warned about.
Still outstanding:
- [ ] Remove row_pk from the JSON API.
- [ ] Avoid computing row_pk in the subscription engine.
- [ ] Update other SDKs.
- [ ] C#
- [ ] TypeScript
- [ ] Python
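Why plain `f32` can't key a hash map, sketched below: it is neither `Eq` nor `Hash` because `NaN != NaN`. A wrapper along the lines of `sats::F32` (reconstructed here from the description above; the real impl may differ) can compare and hash the raw bits instead.

```rust
#[derive(Clone, Copy, Debug)]
struct F32(f32);

impl PartialEq for F32 {
    fn eq(&self, other: &Self) -> bool {
        // Bitwise comparison is total, so NaN == NaN here.
        self.0.to_bits() == other.0.to_bits()
    }
}
impl Eq for F32 {}

impl std::hash::Hash for F32 {
    fn hash<H: std::hash::Hasher>(&self, state: &mut H) {
        self.0.to_bits().hash(state);
    }
}

fn main() {
    assert!(f32::NAN != f32::NAN);           // why f32 can't derive Eq
    assert!(F32(f32::NAN) == F32(f32::NAN)); // bitwise equality is total
}
```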
* Combine similar patterns
Co-authored-by: Mazdak Farrokhzad <twingoow@gmail.com>
Signed-off-by: Phoebe Goldman <phoebe@goldman-tribe.org>
* Remove `row_pk` from JSON messages
* fmt
* SDK: ClientCache keyed on BSATN bytes, not domain types
Per Ingvar's suggestion, this commit makes the client cache
be `HashMap<Vec<u8>, T>`, where the key is a BSATN byte-buffer
containing the serialized representation of the value.
This means that generated structs and enums don't need to be `Hash + Eq`.
The key upside here is that we can revert to using `f32`/`f64`
rather than `sats::F32`/`sats::F64`.
---------
Signed-off-by: Phoebe Goldman <phoebe@goldman-tribe.org>
Co-authored-by: Mazdak Farrokhzad <twingoow@gmail.com>
Closes #890.
This changes the magic constant from 1000 to 3000 rows,
which means that if the indexed table in an index join has 3000 rows or fewer,
we make it the probe table instead.
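The heuristic in sketch form (names are illustrative, not the real ones):

```rust
// Below the row-count threshold, the indexed side of an index join is
// used as the probe table.
const PROBE_TABLE_THRESHOLD: u64 = 3000; // was 1000 before this change

fn use_indexed_side_as_probe(indexed_table_rows: u64) -> bool {
    indexed_table_rows <= PROBE_TABLE_THRESHOLD
}

fn main() {
    assert!(use_indexed_side_as_probe(3000));
    assert!(!use_indexed_side_as_probe(3001));
}
```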
Prior to this commit, we ran our Rayon threads
within the Tokio blocking threadpool.
This was bad for several reasons, of which the most obvious was
that it ignored the Rayon `thread_name`.
Our Rayon threads still need access to the Tokio runtime
(which I'm not super jazzed about; see comment),
so this commit adds a custom `spawn_handler`
which behaves like the Rayon default spawn handler,
but also enters the Tokio runtime.
This commit also adds a comment about how we're over-allocating our available parallelism,
and in the future we should not do that.
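A stdlib-only sketch of the idea behind a custom spawn handler: spawn each worker ourselves so we control the thread name and can enter a runtime context before running the pool's work loop. The real code uses rayon's `ThreadPoolBuilder::spawn_handler` and Tokio's `Handle::enter`; `enter_runtime` and `work` are stand-ins here.

```rust
use std::thread;

fn spawn_worker(
    enter_runtime: impl FnOnce() + Send + 'static,
    work: impl FnOnce() + Send + 'static,
) -> thread::JoinHandle<()> {
    thread::Builder::new()
        // One shared name for all workers, so tools like `perf` can
        // merge their stacks into a single backtrace.
        .name("rayon-worker".to_string())
        .spawn(move || {
            enter_runtime(); // stand-in for tokio's Handle::enter()
            work();          // stand-in for rayon's ThreadBuilder::run()
        })
        .expect("failed to spawn worker thread")
}

fn main() {
    let handle = spawn_worker(
        || { /* enter the tokio runtime here */ },
        || assert_eq!(thread::current().name(), Some("rayon-worker")),
    );
    handle.join().unwrap();
}
```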
* Script to run perf against SpacetimeDB
* Non-controversial script improvements
* No args is fine
---------
Co-authored-by: Boppy <no-reply@boppygames.gg>
- Move it and friends from sats to vm.
- MemTable now stores a Vec<PV>.
- Other related improvements.
Co-authored-by: Phoebe Goldman <phoebe@goldman-tribe.org>
* Replace DbProgram with a leaner version that compiles directly to queries without an environment
* Remove all the dead-code caused by removing the environment
* Addressing some PR comments
* Addressing some PR comments
* Lint
Set a name for Rayon threads, so that they don't inherit their thread names
from the parent, i.e. `tokio-runtime-w`.
Use the same name for all Rayon threads, i.e. ignore the thread-index,
so that tools like `perf` can merge all the Rayon threads into a single backtrace.
* optimize build_missing_tables; collect less + use read_col
* schema_for_table: don't go through PV
* table_id_from_name: use read_col
* relational_db tests: remove uses of to_product_value
* build_sequence_state: don't use to_pv
* build_indexes: don't use to_pv
* remove dead code: CommittedStateIter
* get_all_tables_tx: use read_col
* SystemTableQuery: don't use to_pv
* StModuleRow: don't use to_pv
* get rid of more to_product_value calls
* read_col / system_tables / mut_tx: cleanup + less work