ArchiveBox

mirror of https://github.com/ArchiveBox/ArchiveBox.git synced 2026-06-22 11:31:05 -04:00

Author	SHA1	Message	Date
Nick Sweeting	0e0759a680	Skip binary probes outside data dirs	2026-06-21 05:42:26 -07:00
Nick Sweeting	59a956bf9f	Show dynamic DB binaries in version output	2026-06-21 05:24:15 -07:00
Nick Sweeting	9dbeece35b	fix version command exit status	2026-06-21 02:41:18 -07:00
Nick Sweeting	d259ac8095	fix machine network interface identity	2026-06-21 02:29:53 -07:00
Nick Sweeting	00e818018a	release: v0.9.35rc46	2026-06-21 01:44:30 -07:00
Nick Sweeting	8b57085827	chore: commit local archivebox changes	2026-06-14 11:44:38 -07:00
Nick Sweeting	6f321797c9	fix: publish dev build with updated package deps	2026-06-14 09:34:32 -07:00
Nick Sweeting	88a92b6548	fix: keep version informational without installed binaries	2026-06-14 09:28:41 -07:00
Nick Sweeting	0ddda66ee3	release: archivebox 0.9.35rc35	2026-06-14 08:55:48 -07:00
Nick Sweeting	1dbde4776f	release: archivebox 0.9.35rc27	2026-06-13 23:12:36 -07:00
Nick Sweeting	e547abbf27	Align direct URL Crawl flow with historical depth=0 convention. `archivebox add` and other entry points now seed Crawl.urls as CrawlSeed JSONL at depth=0 (the input layer) with max_depth=depth for direct URLs and depth+1 only for stdin/import text where the synthetic archivebox://internal root lives at depth=0. The runner also accepts one plain URL per line for ORM/crawl-create/schedule callers so every Crawl row goes through the same expansion path without scattering CrawlSeed knowledge across the codebase. Tests updated to match restored convention.	2026-06-13 18:04:45 -07:00
Nick Sweeting	c9e63ccffd	Centralize synthetic root snapshot creation in CrawlRunner. Direct URL inputs from CLI/UI/API now seed Crawl.urls as explicit {type:CrawlSeed,url,depth} JSONL rows; raw stdin/UI/API import text stays verbatim. The runner's create_initial_snapshots() is now the single place that either expands seed rows or creates the synthetic archivebox://internal root + staticfile/stdin.txt, so add paths no longer perform DB/FS side effects and the parser hooks run through the same Snapshot lifecycle as every other extractor.	2026-06-13 17:10:06 -07:00
Nick Sweeting	6a635d3cd6	Fix UI add direct URL runner flow	2026-06-11 09:28:28 -07:00
Nick Sweeting	14d43c88f5	Repair stale binaries in version output	2026-06-11 07:52:05 -07:00
Nick Sweeting	b6921d5e03	Restore direct URL add snapshots	2026-06-11 06:21:46 -07:00
Nick Sweeting	f6c98b67d7	Limit stale binary checks to version output	2026-06-11 01:32:30 -07:00
Nick Sweeting	98aecfaf91	release: archivebox 0.9.35rc8	2026-06-09 23:12:22 -07:00
Nick Sweeting	2bfb3ad4eb	release: archivebox 0.9.35rc5	2026-06-09 22:09:28 -07:00
Nick Sweeting	a1449c2822	Use plugin URL patterns for source imports	2026-06-08 23:40:25 -07:00
Nick Sweeting	5adec53b4c	Expand add flow runtime handling	2026-06-08 23:27:08 -07:00
Nick Sweeting	4fa90e484a	release: archivebox 0.9.34rc71	2026-06-07 20:51:28 -07:00
Nick Sweeting	2659f20dc4	fix runner takeover for scoped snapshot workers	2026-06-07 11:41:17 -07:00
Nick Sweeting	1ba5281343	release: archivebox 0.9.34rc68	2026-06-07 04:19:40 -07:00
Nick Sweeting	87b518314a	release: archivebox 0.9.34rc67	2026-06-07 04:06:45 -07:00
Nick Sweeting	1b19736b2f	Recover interrupted hook work by hook identity	2026-06-05 03:25:38 -07:00
Nick Sweeting	4e4ee8cdb0	fix runner stdin and update maintenance lifecycle	2026-06-04 22:38:22 -07:00
Nick Sweeting	73587a1a4d	use archivebox plugin discovery for extraction queues	2026-06-04 21:57:33 -07:00
Nick Sweeting	7f8af6357d	fix index and binary runner lifecycle	2026-06-04 21:40:58 -07:00
Nick Sweeting	c0fb8eb532	release: archivebox 0.9.34rc39	2026-06-03 17:19:48 -07:00
Nick Sweeting	83a2099851	fix: scope update search backfill runner	2026-06-02 21:54:25 -07:00
Nick Sweeting	3669133a05	fix: allow install to initialize collections	2026-06-02 21:25:20 -07:00
Nick Sweeting	9f0544857c	test: require success in cli workflows	2026-06-02 21:19:21 -07:00
Nick Sweeting	e4ec848da8	fix: keep background add crawls runnable	2026-06-02 18:53:59 -07:00
Nick Sweeting	39eac65ed0	Stabilize frozen config CLI test flows (cherry picked from commit `2bca869e32`)	2026-06-02 18:44:57 -07:00
Nick Sweeting	b46d142cc6	test cleanup	2026-06-02 12:13:47 -07:00
Nick Sweeting	96437e1ffd	Publish local ArchiveBox changes	2026-06-02 02:25:52 -07:00
Nick Sweeting	7dd738b5b7	release: archivebox 0.9.34rc37	2026-06-01 21:44:23 -07:00
Nick Sweeting	ac6e018672	release: archivebox 0.9.34rc34	2026-06-01 19:23:06 -07:00
Nick Sweeting	065fcfc0ba	Preserve queued index jobs during reindex	2026-06-01 15:29:38 -07:00
Nick Sweeting	c075d654d8	Consolidate runtime config handling	2026-06-01 15:03:40 -07:00
Nick Sweeting	453d998e7d	fix: schedule background admin crawls	2026-06-01 10:44:21 -07:00
Nick Sweeting	72a67bd511	Project abxpkg binary events	2026-06-01 02:02:42 -07:00
Nick Sweeting	cab05eb1c6	Refactor plugins search progress and config flows	2026-06-01 00:08:27 -07:00
Nick Sweeting	9bcba41b58	release: archivebox 0.9.33rc58	2026-05-31 04:38:45 -07:00
Nick Sweeting	83d5161b3e	release: v0.9.33rc51	2026-05-31 01:14:40 -07:00
Nick Sweeting	5a38193f56	release: archivebox 0.9.33rc50	2026-05-30 22:27:28 -07:00
Nick Sweeting	6ce2555dfd	fix: rename utils.py → util.py across modules, fix add --index-only, misc cleanups. Renames (no functional change, just consistency with the rest of the codebase): - cli/cli_utils.py → cli/cli_util.py - core/host_utils.py → core/host_util.py - core/tag_utils.py → core/tag_util.py - crawls/schedule_utils.py → crawls/schedule_util.py - machine/env_utils.py → machine/env_util.py. Functional fixes: - archivebox add --index-only now materializes Snapshot rows synchronously via crawl.create_snapshots_from_urls() instead of just queueing the Crawl and leaving the index empty. The previous behavior broke every test that expected --index-only to populate the index, since the runner is never started in index-only mode. - config/collection.py: add _coerce_from_str_dict as the inverse of _coerce_to_str_dict so JSON-encoded INI values are decoded back to native dict/list types when mirrored into Machine.config (a JSONField). Without this, downstream consumers like MachineEvent / abx-dl get raw JSON strings where they expect dicts. Plus matching admin / middleware / model touch-ups, the registration password_change_form template, and assorted small cleanups the user worked through while validating the deploy path. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-30 14:30:33 -07:00
Nick Sweeting	383e4b5c6e	release: archivebox 0.9.33rc45	2026-05-30 05:42:53 -07:00
Nick Sweeting	b0a47e8bf5	wip: snapshot live progress, universal --init, runner perms, supervisord SIGINT - Snapshot detail page: embed scoped live-progress monitor (same-origin /progress.json on whichever host the page is served from); hide admin action buttons when scoped; per-snapshot perms via can_view_snapshot. - crawl_file API: respect crawl-level permissions; PUBLIC/UNLISTED served to guests, PRIVATE returns 404 for non-admin/non-owner. - CrawlRunner: replace allow_paused_snapshot_maintenance with allow_maintenance_on_inactive_crawl so SEALED crawls don't short-circuit the cancellation guard for legitimate maintenance hooks (search backend backfill, fs migration, etc.). Fixes infinite STARTED loop on snapshots with queued search_backend results. - Universal `--init` flag: works on any subcommand (server, update, add, shell, install, ...). Detected at module load, stripped from argv, and consumed in the dispatcher so subprocesses inherit a clean env. - supervisord_util.run_runner_worker: route Ctrl+C through supervisor.signalProcess(name, "SIGINT") instead of raw os.kill on a cached pid, gated on statename=RUNNING. Prevents killing unrelated processes when the worker's pid has been reused by the OS. - Login page: remove non-functional password-reset links; add has_real_admin_users template tag to gate the bootstrap hint. - Add page: hide underline on the "Get the extension" link. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-30 04:45:15 -07:00
Nick Sweeting	2d2b8ff047	release: archivebox 0.9.33rc39	2026-05-29 03:53:41 -07:00


343 Commits