Direct URL inputs from CLI/UI/API now seed Crawl.urls as explicit
{type:CrawlSeed,url,depth} JSONL rows; raw stdin/UI/API import text
stays verbatim. The runner's create_initial_snapshots() is now the
single place that either expands seed rows or creates the synthetic
archivebox://internal root + staticfile/stdin.txt, so add paths no
longer perform DB/FS side effects and the parser hooks run through
the same Snapshot lifecycle as every other extractor.
Renames (no functional change, just consistency with the rest of the codebase):
- cli/cli_utils.py → cli/cli_util.py
- core/host_utils.py → core/host_util.py
- core/tag_utils.py → core/tag_util.py
- crawls/schedule_utils.py → crawls/schedule_util.py
- machine/env_utils.py → machine/env_util.py
Functional fixes:
- archivebox add --index-only now materializes Snapshot rows synchronously
via crawl.create_snapshots_from_urls() instead of just queueing the Crawl
and leaving the index empty. The previous behavior broke every test that
expected --index-only to populate the index, since the runner is never
started in index-only mode.
- config/collection.py: add _coerce_from_str_dict as the inverse of
_coerce_to_str_dict so JSON-encoded INI values are decoded back to native
dict/list types when mirrored into Machine.config (a JSONField). Without
this, downstream consumers like MachineEvent / abx-dl get raw JSON
strings where they expect dicts.
Plus matching admin / middleware / model touch-ups, the registration
password_change_form template, and assorted small cleanups the user
worked through while validating the deploy path.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- Snapshot detail page: embed scoped live-progress monitor (same-origin
/progress.json on whichever host the page is served from); hide admin
action buttons when scoped; per-snapshot perms via can_view_snapshot.
- crawl_file API: respect crawl-level permissions; PUBLIC/UNLISTED served
to guests, PRIVATE returns 404 for non-admin/non-owner.
- CrawlRunner: replace allow_paused_snapshot_maintenance with
allow_maintenance_on_inactive_crawl so SEALED crawls don't short-circuit
the cancellation guard for legitimate maintenance hooks (search backend
backfill, fs migration, etc.). Fixes infinite STARTED loop on snapshots
with queued search_backend results.
- Universal `--init` flag: works on any subcommand (server, update, add,
shell, install, ...). Detected at module load, stripped from argv, and
consumed in the dispatcher so subprocesses inherit a clean env.
- supervisord_util.run_runner_worker: route Ctrl+C through
supervisor.signalProcess(name, "SIGINT") instead of raw os.kill on a
cached pid, gated on statename=RUNNING. Prevents killing unrelated
processes when the worker's pid has been reused by the OS.
- Login page: remove non-functional password-reset links; add
has_real_admin_users template tag to gate the bootstrap hint.
- Add page: hide underline on the "Get the extension" link.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Address pirate's review: restore the slug in the client-side
download fallback filename. Expose tag.slug as data-slug on the
card element and in the search card schema so the JS can read it
directly without slugifying client-side.
Replaces the tag_filename_safe() helper with a Tag.slug property
that returns the slugified form via django.utils.text.slugify.
Call sites now just use tag.slug directly.
Addresses review feedback from cubic and devin: quote()'s percent-
encoding isn't decoded by browsers in Content-Disposition's filename
parameter (Safari saves literal %20). Switch to Django's slugify()
which does NFKD normalization, ASCII transliteration, and replaces
punctuation with hyphens — producing clean names like
"tag-alpha-research-urls.txt".
- Add tag_filename_safe(name) helper wrapping slugify
- Use it in both tag export endpoints
- Drop the now-unneeded JS fallback name (server always sets
Content-Disposition)
Applies pirate's review suggestion on PR #1789: mark the
Content-Disposition filename encoding as a known-rough approach
that could be hardened further (strip punctuation, convert to
ASCII equivalents) in a follow-up.
Tags now support full unicode with no restrictions. URL-encode the tag
name wherever it previously used the slug (export filenames, lookups).
- Remove `slug` field, `_generate_unique_slug`, and slug handling in save()
- Add migration 0034 to drop the slug column
- `get_tag_by_ref` now resolves by URL-decoded exact name match
- Tag search/autocomplete/export filenames use the name directly
- Drop slug from admin search_fields/readonly_fields/fieldsets
- Remove slug display from similar-tag cards and client download filename
Implement a sleek inline tag editor with autocomplete and AJAX support:
- Create TagEditorWidget and InlineTagEditorWidget in core/widgets.py
- Pills display with X remove button, sorted alphabetically
- Text input with HTML5 datalist autocomplete
- Enter/Space/Comma to add tags, auto-creates if doesn't exist
- Backspace removes last tag when input is empty
- Add API endpoints in api/v1_core.py
- GET /tags/autocomplete/ - search tags by name
- POST /tags/create/ - get_or_create tag
- POST /tags/add-to-snapshot/ - add tag to snapshot via AJAX
- POST /tags/remove-from-snapshot/ - remove tag from snapshot
- Update admin_snapshots.py
- Replace FilteredSelectMultiple with TagEditorWidget in bulk actions
- Create SnapshotAdminForm with tags_editor field
- Update title_str() to render inline tag editor in list view
- Remove TagInline, use widget instead
- Add CSS styles in templates/admin/base.html
- Blue gradient pill styling matching admin theme
- Focus ring and hover states
- Compact inline variant for list view
<!-- IMPORTANT: Do not submit PRs with only formatting / PEP8 / line
length changes. -->
# Summary
<!--e.g. This PR fixes ABC or adds the ability to do XYZ...-->
# Related issues
<!-- e.g. #123 or Roadmap goal #
https://github.com/pirate/ArchiveBox/wiki/Roadmap -->
# Changes these areas
- [ ] Bugfixes
- [ ] Feature behavior
- [ ] Command line interface
- [ ] Configuration options
- [ ] Internal architecture
- [ ] Snapshot data layout on disk
<!-- This is an auto-generated description by cubic. -->
---
## Summary by cubic
Implemented a new interactive tags editor for Django admin with
autocomplete and AJAX add/remove, replacing the old multi-select and
inline. This makes tagging snapshots faster and safer in detail, list,
and bulk actions.
- **New Features**
- TagEditorWidget and InlineTagEditorWidget with pill UI and remove
buttons, XSS-safe rendering, and delegated events.
- Keyboard support: Enter/Space/Comma to add, Backspace to remove last
when input is empty.
- Datalist autocomplete and debounced search via GET
/tags/autocomplete/.
- AJAX endpoints: POST /tags/create/, /tags/add-to-snapshot/,
/tags/remove-from-snapshot/.
- **Refactors**
- Replaced FilteredSelectMultiple with TagEditorWidget in bulk actions;
parse comma-separated tags and use bulk_create/delete for efficient
add/remove.
- Added SnapshotAdminForm with tags_editor field; saves tags
case-insensitively and fixes remove_tags matching.
- Rendered inline tag editor in list view via title_str; removed
TagInline.
- Added CSS in admin/base.html for pill styling, focus ring, and compact
inline variant.
<sup>Written for commit 0dee662f41.
Summary will update on new commits.</sup>
<!-- End of auto-generated description by cubic. -->
Implement a sleek inline tag editor with autocomplete and AJAX support:
- Create TagEditorWidget and InlineTagEditorWidget in core/widgets.py
- Pills display with X remove button, sorted alphabetically
- Text input with HTML5 datalist autocomplete
- Enter/Space/Comma to add tags, auto-creates if doesn't exist
- Backspace removes last tag when input is empty
- Add API endpoints in api/v1_core.py
- GET /tags/autocomplete/ - search tags by name
- POST /tags/create/ - get_or_create tag
- POST /tags/add-to-snapshot/ - add tag to snapshot via AJAX
- POST /tags/remove-from-snapshot/ - remove tag from snapshot
- Update admin_snapshots.py
- Replace FilteredSelectMultiple with TagEditorWidget in bulk actions
- Create SnapshotAdminForm with tags_editor field
- Update title_str() to render inline tag editor in list view
- Remove TagInline, use widget instead
- Add CSS styles in templates/admin/base.html
- Blue gradient pill styling matching admin theme
- Focus ring and hover states
- Compact inline variant for list view