# Description of Changes
This PR implements support for the `spacetime.json` configuration file
that can be used to set up common `generate` and `publish` targets. An
example of `spacetime.json` could look like this:
```
{
  "dev_run": "pnpm dev",
  "generate": [
    { "out-dir": "./foobar", "module-path": "region-module", "language": "c-sharp" },
    { "out-dir": "./global", "module-path": "global-module", "language": "c-sharp" }
  ],
  "publish": {
    "database": "bitcraft",
    "module-path": "spacetimedb",
    "server": "local",
    "children": [
      { "database": "region-1", "module-path": "region-module", "server": "local" },
      { "database": "region-2", "module-path": "region-module", "server": "local" }
    ]
  }
}
```
With this config, running `spacetime generate` without any arguments
would generate bindings for two targets: `region-module` and
`global-module`. `spacetime publish` without any arguments would publish
three modules, starting from the parent: `bitcraft`, `region-1`, and
`region-2`. On top of that, the command `pnpm dev` would be executed
when using `spacetime dev`.
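To make the shape of the file concrete, here is a rough sketch of how it could be modeled with serde on the Rust side. The type and field names are hypothetical, not the PR's actual definitions:
```rust
// Hypothetical model of spacetime.json; the PR's real types may differ.
use serde::Deserialize;

#[derive(Debug, Deserialize)]
pub struct SpacetimeConfig {
    /// Command run by `spacetime dev`, e.g. "pnpm dev".
    pub dev_run: Option<String>,
    /// Targets used by a bare `spacetime generate`.
    #[serde(default)]
    pub generate: Vec<GenerateTarget>,
    /// Parent module (plus children) used by a bare `spacetime publish`.
    pub publish: Option<PublishTarget>,
}

#[derive(Debug, Deserialize)]
#[serde(rename_all = "kebab-case")] // matches "out-dir", "module-path"
pub struct GenerateTarget {
    pub out_dir: String,
    pub module_path: String,
    pub language: String,
}

#[derive(Debug, Deserialize)]
#[serde(rename_all = "kebab-case")]
pub struct PublishTarget {
    pub database: String,
    pub module_path: String,
    pub server: Option<String>,
    /// Child databases published after the parent.
    #[serde(default)]
    pub children: Vec<PublishTarget>,
}
```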
It is also possible to pass additional command-line arguments when
calling the `publish` and `generate` commands, but with certain
limitations. There is a special case for passing a module path to
`generate` or a database name to `publish`: doing so filters out the
entries in the config file that do not match. For example, running:
```
spacetime generate --project-path global-module
```
would only generate bindings for the second entry in the `generate`
list.
In a similar fashion, running:
```
spacetime publish region-1
```
would only publish the child database named `region-1`.
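Conceptually, the filtering boils down to keeping only the config entries whose identifier matches the positional argument. A simplified sketch, reusing the hypothetical `PublishTarget` type from above (the real CLI also has to report an error when nothing matches):
```rust
// Select which publish targets a given invocation applies to.
fn select_publish_targets<'a>(
    root: &'a PublishTarget,
    name: Option<&str>,
) -> Vec<&'a PublishTarget> {
    // Flatten the parent and its children into publish order.
    let all: Vec<&PublishTarget> =
        std::iter::once(root).chain(root.children.iter()).collect();
    match name {
        // e.g. `spacetime publish region-1` keeps only the match.
        Some(n) => all.into_iter().filter(|t| t.database == n).collect(),
        // A bare `spacetime publish` keeps everything.
        None => all,
    }
}
```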
Passing other existing arguments is also possible, but not all of them
can be applied to multiple configs. For example, when running
`spacetime publish --server maincloud`, the publish command is applied
to all of the modules listed in the config file, with the `server`
value from the command line taking precedence over the values in the
file. An argument like `--bin-path`, however, only makes sense in the
context of a specific module, so `spacetime publish --bin-path
spacetimedb/target/debug/bitcraft.wasm` would not work. It throws an
error unless there is exactly one entry left to process, so `spacetime
publish --bin-path spacetimedb/target/debug/bitcraft.wasm bitcraft`
works, since the database name filters the publish targets down to a
single entry.
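That single-entry rule can be expressed as a small check that runs after the filtering step. A hypothetical sketch; the real error handling and message will differ:
```rust
// Per-module flags like `--bin-path` are rejected unless filtering
// produced exactly one publish target.
fn check_per_module_flags(
    targets: &[&PublishTarget],
    bin_path: Option<&std::path::Path>,
) -> Result<(), String> {
    if bin_path.is_some() && targets.len() != 1 {
        return Err(format!(
            "--bin-path applies to a single module, but {} targets matched; \
             pass a database name to narrow the selection",
            targets.len()
        ));
    }
    Ok(())
}
```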
# API and ABI breaking changes
None
# Expected complexity level and risk
3
The config file in itself is not overly complex, but coupling it with
the CLI is somewhat tricky to get right. I also had to change how clap
arguments are validated: because values can now come from both the
config file and the command line, we can't use some of clap's built-in
validations like `required`, or at least I haven't found a clean way to
do so.
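Concretely, the likely pattern is to make the affected clap arguments optional and enforce "required" by hand after merging the command-line values with the config file. A sketch under those assumptions, not the PR's actual code:
```rust
use clap::Parser;

// Hypothetical shape of the publish arguments. The point is that
// `server` can no longer be `required`, because it may come from
// spacetime.json instead of the command line.
#[derive(Parser)]
struct PublishArgs {
    /// Server to publish to; falls back to the config file when omitted.
    #[arg(long)]
    server: Option<String>,
    /// Optional database name used to filter the config entries.
    database: Option<String>,
}

// Manual "required" check, run after merging; CLI values take precedence.
fn resolve_server(args: &PublishArgs, cfg: &PublishTarget) -> Result<String, String> {
    args.server
        .clone()
        .or_else(|| cfg.server.clone())
        .ok_or_else(|| "no --server on the command line or in spacetime.json".to_string())
}
```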
# Testing
I've added some automated tests, but more tests and manual testing are
coming.
---------
Signed-off-by: Tyler Cloutier <cloutiertyler@users.noreply.github.com>
Co-authored-by: bradleyshep <148254416+bradleyshep@users.noreply.github.com>
Co-authored-by: clockwork-labs-bot <clockwork-labs-bot@users.noreply.github.com>
Co-authored-by: = <cloutiertyler@gmail.com>
Co-authored-by: Tyler Cloutier <cloutiertyler@users.noreply.github.com>
# AI One-Shot App Generation
This project benchmarks how well Cursor rules enable AI to one-shot SpacetimeDB apps — generate and deploy a working app in a single attempt.
## Purpose
This benchmark compares AI-generated apps across two platforms:
- SpacetimeDB — Real-time database with automatic client sync
- PostgreSQL — Traditional database requiring manual WebSocket broadcasting
By generating equivalent apps for both platforms, we can evaluate how well Cursor rules guide the AI to produce working SpacetimeDB applications compared to a familiar baseline (PostgreSQL).
## How to Run a Benchmark
### Prerequisites
- Install Cursor IDE (free download)
- Have a Cursor subscription or API credits for the model you want to test
- For SpacetimeDB tests: install the SpacetimeDB CLI
- For PostgreSQL tests: have Docker installed (for the database container)
### Step-by-Step Instructions
1. **Open this folder as a workspace in Cursor**
   - File → Open Folder → select `tools/llm-oneshot`
   - This folder must be the workspace root so Cursor loads the `.cursor/rules/` files
2. **Open a new Agent chat**
   - Press `Ctrl+I` (Windows/Linux) or `Cmd+I` (Mac) to open the AI panel
   - Or click the Cursor icon in the sidebar
3. **Select your model**
   - Click the model dropdown at the bottom of the chat panel
   - Choose the model you want to benchmark (e.g., Claude Opus 4.5, GPT-5, Gemini 3 Pro)
4. **Add the prompt files**
   - Drag these two files from the file explorer directly into the chat:
     - `apps/chat-app/prompts/language/typescript-spacetime.md` (or your desired stack)
     - `apps/chat-app/prompts/composed/12_full.md` (or your desired feature level)
   - Then type this message:
     > Read all rules first. Do not reference AI-generated apps in apps/ for guidance. Execute these prompts.
5. **Let the AI generate the app**
   - Press Enter to send the prompt
   - The AI will read the rules, then generate the backend and client code
   - Do not interrupt; let it complete the full generation
6. **Deploy when prompted**
   - The AI will ask if you want to deploy (Local / Cloud / Skip)
   - Choose "Local" to test the app on your machine

**Why isolate from existing apps?** To ensure clean results. If the AI references previous attempts, we can't tell whether success came from the rules or from copying.
## Example Configurations
**TypeScript + SpacetimeDB (full features):**
- Language: `apps/chat-app/prompts/language/typescript-spacetime.md`
- Level: `apps/chat-app/prompts/composed/12_full.md`

**TypeScript + PostgreSQL (full features):**
- Language: `apps/chat-app/prompts/language/typescript-postgres.md`
- Level: `apps/chat-app/prompts/composed/12_full.md`
## Available Stacks
| Language File | Stack |
|---|---|
| `typescript-spacetime.md` | TypeScript + SpacetimeDB (React) |
| `typescript-postgres.md` | TypeScript + PostgreSQL (Express) |
## Feature Levels
Each level is cumulative.
| Level | Features Added |
|---|---|
| 01 | Basic Chat, Typing, Read Receipts, Unread |
| 02 | + Scheduled Messages |
| 03 | + Ephemeral Messages |
| 04 | + Reactions |
| 05 | + Edit History |
| 06 | + Permissions |
| 07 | + Presence |
| 08 | + Threading |
| 09 | + Private Rooms |
| 10 | + Activity Indicators |
| 11 | + Draft Sync |
| 12 | + Anonymous Migration (ALL) |
## After Generation
The AI will ask (per the `deployment.mdc` rules):
- **Deploy?** Local / Cloud / Skip
- **Grade?** The AI reviews the code and writes a `GRADING_RESULTS.md` file
## Grading
Grading is done manually, with the AI doing a shallow pass before manual review. The grading rubric is in `apps/{app}/prompts/grading_rubric.md`.
Each graded app gets a `GRADING_RESULTS.md` file in its folder.
## Aggregating Results
To generate summary reports from all graded apps:
```
cd tools/llm-oneshot
pnpm install
pnpm run summarize
```
This outputs to `docs/llms/`:
- `oneshot-summary.md` – Combined summary with feature scores
- `oneshot-grades.json` – Structured data for websites
## Folder Structure
Generated apps are stored in:
```
apps/{app-name}/{language}/{model}/{platform}/{app-name}-{YYYYMMDD-HHMMSS}/
```
Example:
```
apps/chat-app/typescript/opus-4-5/spacetime/chat-app-20260107-120000/
apps/chat-app/typescript/opus-4-5/postgres/chat-app-20260108-140000/
```
This structure allows comparing results across:
- Apps — chat-app, paint-app
- Models — opus-4-5, grok-code, gemini-3-pro, gpt-5-2
- Platforms — spacetime vs postgres