99 Commits

Author SHA1 Message Date
Claude 7c3a3e0dba Put tag slug back in JS download filename
Address pirate's review: restore the slug in the client-side
download fallback filename. Expose tag.slug as data-slug on the
card element and in the search card schema so the JS can read it
directly without slugifying client-side.
2026-04-21 17:35:57 +00:00
Claude 2ea66d05d1 Move tag slug logic onto Tag.slug @property
Replaces the tag_filename_safe() helper with a Tag.slug property
that returns the slugified form via django.utils.text.slugify.
Call sites now just use tag.slug directly.
2026-04-21 17:32:25 +00:00
Claude 0041a2d407 Sanitize tag export filenames via django.utils.text.slugify
Addresses review feedback from cubic and devin: browsers don't decode
quote()'s percent-encoding in Content-Disposition's filename parameter
(Safari saves a literal %20). Switch to Django's slugify(), which does
NFKD normalization, folds the result to ASCII, strips punctuation, and
collapses whitespace into hyphens, producing clean names like
"tag-alpha-research-urls.txt".

- Add tag_filename_safe(name) helper wrapping slugify
- Use it in both tag export endpoints
- Drop the now-unneeded JS fallback name (server always sets
  Content-Disposition)
2026-04-21 17:30:50 +00:00
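
The difference between percent-encoding and slugifying a filename can be sketched as follows. This is a minimal stdlib stand-in that approximates what `django.utils.text.slugify` does in its default ASCII mode, not the project's actual helper. The names `slugify_ascii` and `export_filename`, the `-urls.txt` suffix pattern, and the `"tag"` fallback for names that slugify to an empty string are all assumptions made for illustration.

```python
import re
import unicodedata
from urllib.parse import quote


def slugify_ascii(value: str) -> str:
    """Approximate Django's slugify(): NFKD-normalize, fold to ASCII,
    drop punctuation, and collapse whitespace/hyphen runs into one hyphen."""
    value = unicodedata.normalize("NFKD", value)
    value = value.encode("ascii", "ignore").decode("ascii")
    value = re.sub(r"[^\w\s-]", "", value.lower())
    return re.sub(r"[-\s]+", "-", value).strip("-_")


def export_filename(tag_name: str) -> str:
    """Hypothetical filename builder; falls back to 'tag' (an assumption)
    when the name contains nothing that survives ASCII folding."""
    slug = slugify_ascii(tag_name) or "tag"
    return f"{slug}-urls.txt"


def content_disposition(tag_name: str) -> str:
    """Build the header value with a plain ASCII filename parameter."""
    return f'attachment; filename="{export_filename(tag_name)}"'


# Percent-encoding leaves sequences that some browsers save literally:
print(quote("Tag Alpha"))                      # Tag%20Alpha
# Slugifying yields a name that needs no decoding:
print(content_disposition("Tag Alpha: Research"))
```

Because the slug contains only ASCII letters, digits, hyphens, and underscores, it can go into the quoted `filename` parameter without any escaping.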
Claude b83e2de73a Add TODO on tag export filename encoding
Applies pirate's review suggestion on PR #1789: mark the
Content-Disposition filename encoding as a known-rough approach
that could be hardened further (strip punctuation, convert to
ASCII equivalents) in a follow-up.
2026-04-21 17:27:06 +00:00
Claude ec9c7c89f4 Drop Tag slug column and use URL-encoded names
Tags now support full unicode with no restrictions. URL-encode the tag
name wherever it previously used the slug (export filenames, lookups).

- Remove `slug` field, `_generate_unique_slug`, and slug handling in save()
- Add migration 0034 to drop the slug column
- `get_tag_by_ref` now resolves by URL-decoded exact name match
- Tag search/autocomplete/export filenames use the name directly
- Drop slug from admin search_fields/readonly_fields/fieldsets
- Remove slug display from similar-tag cards and client download filename
2026-04-20 17:05:07 +00:00
Nick Sweeting b749b26c5d wip 2026-03-23 03:58:32 -07:00
Nick Sweeting f400a2cd67 WIP: checkpoint working tree before rebasing onto dev 2026-03-22 20:25:18 -07:00
Nick Sweeting 57e11879ec cleanup archivebox tests 2026-03-15 22:09:56 -07:00
Nick Sweeting 49436af869 Tighten CLI and admin typing 2026-03-15 19:33:15 -07:00
Nick Sweeting 5381f7584c Tighten API typing and add return values 2026-03-15 19:24:54 -07:00
Nick Sweeting 95a105feb9 small fixes 2026-03-15 19:22:06 -07:00
Nick Sweeting f932054915 add stricter locking around state machine models 2026-03-15 19:21:41 -07:00
Nick Sweeting 934e02695b fix lint 2026-03-15 18:45:29 -07:00
Nick Sweeting 70c9358cf9 Improve scheduling, runtime paths, and API behavior 2026-03-15 18:31:56 -07:00
Nick Sweeting 7d42c6c8b5 bump versions and fix docs 2026-03-15 17:43:07 -07:00
Nick Sweeting ec4b27056e wip 2026-01-21 03:19:56 -08:00
Nick Sweeting dd77511026 unified Process source of truth and better screenshot tests 2026-01-02 04:20:34 -08:00
Nick Sweeting 099d955ef5 Implement tags editor widget for Django admin (#1729)
Implement a sleek inline tag editor with autocomplete and AJAX support:

- Create TagEditorWidget and InlineTagEditorWidget in core/widgets.py
  - Pills display with X remove button, sorted alphabetically
  - Text input with HTML5 datalist autocomplete
  - Enter/Space/Comma to add tags, auto-creates the tag if it doesn't exist
  - Backspace removes last tag when input is empty

- Add API endpoints in api/v1_core.py
  - GET /tags/autocomplete/ - search tags by name
  - POST /tags/create/ - get_or_create tag
  - POST /tags/add-to-snapshot/ - add tag to snapshot via AJAX
  - POST /tags/remove-from-snapshot/ - remove tag from snapshot

- Update admin_snapshots.py
  - Replace FilteredSelectMultiple with TagEditorWidget in bulk actions
  - Create SnapshotAdminForm with tags_editor field
  - Update title_str() to render inline tag editor in list view
  - Remove TagInline, use widget instead

- Add CSS styles in templates/admin/base.html
  - Blue gradient pill styling matching admin theme
  - Focus ring and hover states
  - Compact inline variant for list view


<!-- This is an auto-generated description by cubic. -->
---
## Summary by cubic
Implemented a new interactive tags editor for Django admin with
autocomplete and AJAX add/remove, replacing the old multi-select and
inline. This makes tagging snapshots faster and safer in detail, list,
and bulk actions.

- **New Features**
- TagEditorWidget and InlineTagEditorWidget with pill UI and remove
buttons, XSS-safe rendering, and delegated events.
- Keyboard support: Enter/Space/Comma to add, Backspace to remove last
when input is empty.
- Datalist autocomplete and debounced search via GET
/tags/autocomplete/.
- AJAX endpoints: POST /tags/create/, /tags/add-to-snapshot/,
/tags/remove-from-snapshot/.

- **Refactors**
- Replaced FilteredSelectMultiple with TagEditorWidget in bulk actions;
parse comma-separated tags and use bulk_create/delete for efficient
add/remove.
- Added SnapshotAdminForm with tags_editor field; saves tags
case-insensitively and fixes remove_tags matching.
- Rendered inline tag editor in list view via title_str; removed
TagInline.
- Added CSS in admin/base.html for pill styling, focus ring, and compact
inline variant.

<sup>Written for commit 0dee662f41.
Summary will update on new commits.</sup>

<!-- End of auto-generated description by cubic. -->
2025-12-30 11:59:39 -08:00
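
The comma-separated parsing and case-insensitive handling mentioned in the summary above could look roughly like this. It is a sketch under assumptions: `parse_tag_names` is a hypothetical name, and keeping the first spelling seen for each tag is a choice made here, not something the commit states.

```python
def parse_tag_names(raw: str) -> list[str]:
    """Split a comma-separated string into tag names, dropping blanks and
    case-insensitive duplicates, then sort alphabetically for display."""
    seen: dict[str, str] = {}
    for part in raw.split(","):
        name = part.strip()
        key = name.casefold()
        if name and key not in seen:
            seen[key] = name  # keep the first spelling encountered
    return sorted(seen.values(), key=str.casefold)


print(parse_tag_names("news, Research,,news , research"))
```

Sorting with `str.casefold` as the key keeps the pill order stable regardless of capitalization.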
Nick Sweeting 95beddc5fc more migration fixes 2025-12-29 22:12:57 -08:00
Nick Sweeting 80f75126c6 more fixes 2025-12-29 21:03:05 -08:00
Claude 202e5b2e59 Add interactive tags editor widget for Django admin
Implement a sleek inline tag editor with autocomplete and AJAX support:

- Create TagEditorWidget and InlineTagEditorWidget in core/widgets.py
  - Pills display with X remove button, sorted alphabetically
  - Text input with HTML5 datalist autocomplete
  - Enter/Space/Comma to add tags, auto-creates the tag if it doesn't exist
  - Backspace removes last tag when input is empty

- Add API endpoints in api/v1_core.py
  - GET /tags/autocomplete/ - search tags by name
  - POST /tags/create/ - get_or_create tag
  - POST /tags/add-to-snapshot/ - add tag to snapshot via AJAX
  - POST /tags/remove-from-snapshot/ - remove tag from snapshot

- Update admin_snapshots.py
  - Replace FilteredSelectMultiple with TagEditorWidget in bulk actions
  - Create SnapshotAdminForm with tags_editor field
  - Update title_str() to render inline tag editor in list view
  - Remove TagInline, use widget instead

- Add CSS styles in templates/admin/base.html
  - Blue gradient pill styling matching admin theme
  - Focus ring and hover states
  - Compact inline variant for list view
2025-12-30 02:18:08 +00:00
Nick Sweeting 30c60eef76 much better tests and add page ui 2025-12-29 04:02:11 -08:00
Nick Sweeting f4e7820533 use full dotted paths for all archivebox imports, add migrations and more fixes 2025-12-29 00:47:08 -08:00
Nick Sweeting f0aa19fa7d wip 2025-12-28 17:51:54 -08:00
Nick Sweeting bd265c0083 rename extractor to plugin everywhere 2025-12-28 04:43:15 -08:00
Nick Sweeting 50e527ec65 way better plugin hooks system wip 2025-12-28 03:39:59 -08:00
Claude b632894bc9 Update views, API, and exports for new ArchiveResult output fields
Replace old `output` field with new fields across the codebase:
- output_str: Human-readable output summary
- output_json: Structured metadata (optional)
- output_files: Dict of output files with metadata
- output_size: Total size in bytes
- output_mimetypes: CSV of file mimetypes

Files updated:
- api/v1_core.py: Update MinimalArchiveResultSchema to expose new fields
- api/v1_core.py: Update ArchiveResultFilterSchema to search output_str
- cli/archivebox_extract.py: Use output_str in CLI output
- core/admin_archiveresults.py: Update admin fields, search, and fieldsets
- core/admin_archiveresults.py: Fix output_html variable name bug in output_summary
- misc/jsonl.py: Update archiveresult_to_jsonl() to include new fields
- plugins/extractor_utils.py: Update ExtractorResult helper class

The embed_path() method already uses output_files and output_str,
so snapshot detail page and template tags work correctly.
2025-12-27 20:28:22 +00:00
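
To make the relationship between the new fields concrete, here is a small sketch of how `output_size` and `output_mimetypes` might be derived from `output_files`. The helper name `summarize_output_files` and the per-file metadata keys `"size"` and `"mimetype"` are assumptions; the commit only says that `output_files` is a dict of output files with metadata.

```python
def summarize_output_files(output_files: dict[str, dict]) -> tuple[int, str]:
    """Return (total size in bytes, comma-separated sorted unique mimetypes)."""
    total = sum(meta.get("size", 0) for meta in output_files.values())
    mimetypes = sorted(
        {meta["mimetype"] for meta in output_files.values() if meta.get("mimetype")}
    )
    return total, ",".join(mimetypes)


files = {
    "index.html": {"size": 2048, "mimetype": "text/html"},
    "screenshot.png": {"size": 4096, "mimetype": "image/png"},
    "frame.html": {"size": 10, "mimetype": "text/html"},
}
output_size, output_mimetypes = summarize_output_files(files)
print(output_size, output_mimetypes)  # 6154 image/png,text/html
```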
Claude 13be196fd7 Merge remote-tracking branch 'origin/dev' into claude/improve-test-suite-xm6Bh
# Conflicts:
#	pyproject.toml
2025-12-27 02:27:51 +00:00
Nick Sweeting e2cbcd17f6 more tests and migrations fixes 2025-12-26 18:22:48 -08:00
Claude ae2ab5b273 Add Python 3.13 support with uuid7 backport compatibility
- Create uuid_compat.py module that provides uuid7 for Python <3.14
  using uuid_extensions package, and native uuid.uuid7 for Python 3.14+
- Update all model files and migrations to use archivebox.uuid_compat
- Add uuid7 conditional dependency in pyproject.toml for Python <3.14
- Update requires-python to >=3.13 (from >=3.14)
- Update GitHub workflows, lock_pkgs.sh to use Python 3.13
- Update tool configs (ruff, pyright, uv) for Python 3.13

This enables running ArchiveBox on Python 3.13 while maintaining
forward compatibility with Python 3.14's native uuid7 support.
2025-12-27 01:07:30 +00:00
Nick Sweeting bb53228ebf remove Seed model in favor of Crawl as template 2025-12-25 01:52:41 -08:00
Nick Sweeting 866f993f26 logging and admin ui improvements 2025-12-25 01:10:41 -08:00
Nick Sweeting d95f0dc186 remove huey 2025-12-24 23:40:18 -08:00
Nick Sweeting 6c769d831c wip 2 2025-12-24 21:46:14 -08:00
Nick Sweeting 1915333b81 wip major changes 2025-12-24 20:10:38 -08:00
Nick Sweeting c1335fed37 Remove ABID system and KVTag model - use UUIDv7 IDs exclusively
This commit completes the simplification of the ID system by:

- Removing the ABID (ArchiveBox ID) system entirely
- Removing the base_models/abid.py file
- Removing KVTag model in favor of the existing Tag model in core/models.py
- Simplifying all models to use standard UUIDv7 primary keys
- Removing ABID-related admin functionality
- Cleaning up commented-out ABID code from views and statemachines
- Deleting migration files for ABID field removal (no longer needed)

All models now use simple UUIDv7 ids via `id = models.UUIDField(primary_key=True, default=uuid7)`

Note: Old migrations containing ABID references are preserved for database
migration history compatibility.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-24 06:13:49 -08:00
Nick Sweeting c374d7695e allow getting crawl from API as rss feed 2024-12-03 02:13:45 -08:00
Nick Sweeting 328eb98a38 move main funcs into cli files and switch to using click for CLI 2024-11-19 00:18:51 -08:00
Nick Sweeting 569081a9eb rename abid_utils to base_models 2024-11-18 19:40:05 -08:00
Nick Sweeting 65afd405b1 merge seeds and crawls apps 2024-11-18 19:23:14 -08:00
Nick Sweeting e469c5a344 merge queues and actors apps into new workers app 2024-11-18 18:52:48 -08:00
Nick Sweeting eeb2671e4d API improvements 2024-11-18 04:27:38 -08:00
Nick Sweeting c7bd9449d5 better jobs dashboard with faster refresh 2024-11-18 04:27:38 -08:00
Nick Sweeting eb53145e4e working state machine flow yay 2024-11-18 04:27:38 -08:00
Nick Sweeting 1e3ce67834 fix API and CLI calls 2024-11-18 04:27:38 -08:00
Nick Sweeting b852442efc add crawls app back to django admin 2024-11-18 04:27:37 -08:00
Nick Sweeting 8f8fbbb7a2 API fixes and add actors endpoints 2024-11-17 20:09:06 -08:00
Nick Sweeting 43514da0d0 add crawl and seed endpoints to REST API 2024-11-16 02:45:11 -08:00
Ben Muthalaly 4213d7dc27 Fix API crash 2024-10-26 01:53:49 -05:00
Nick Sweeting c0b7887fd7 fix admin registration using abx hooks 2024-10-14 17:38:38 -07:00