## Summary
- add a vanilla HTML/CSS landing page under repo-root `publicsite/`
- keep the existing ArchiveBox logo and custom domain CNAME in the Pages
artifact
- use the light-mode ArchiveBox design tokens with no dark-mode CSS
- update the GitHub Pages workflow to deploy `./publicsite` directly
without Jekyll
- remove the old top-level `website/` tree and duplicate Jekyll Pages
workflow
## Validation
- `ruby -e "require 'yaml'; YAML.load_file('.github/workflows/gh-pages.yml')"`
- parsed `publicsite/index.html` with Python `HTMLParser`
- served `publicsite` locally and verified `/`, `styles.css`,
`icon.png`, and `CNAME` return 200
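
The local serving check above could be scripted roughly as follows. This is only a sketch: `check_paths` is a hypothetical helper, not part of the repo, and the temporary directory with placeholder files stands in for the real `publicsite/` tree.

```python
import functools
import http.server
import tempfile
import threading
import urllib.request
from pathlib import Path


def check_paths(site_dir, paths):
    """Serve site_dir on an ephemeral local port and return {path: HTTP status}."""
    handler = functools.partial(
        http.server.SimpleHTTPRequestHandler, directory=str(site_dir)
    )
    server = http.server.ThreadingHTTPServer(("127.0.0.1", 0), handler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        port = server.server_address[1]
        statuses = {}
        for path in paths:
            # urlopen raises HTTPError for 4xx/5xx, so a missing file fails loudly.
            with urllib.request.urlopen(f"http://127.0.0.1:{port}{path}") as resp:
                statuses[path] = resp.status
        return statuses
    finally:
        server.shutdown()
        server.server_close()


# Stand-in directory with placeholder files; point check_paths at publicsite/ instead.
with tempfile.TemporaryDirectory() as tmp:
    site = Path(tmp)
    for name in ("index.html", "styles.css", "icon.png", "CNAME"):
        (site / name).write_text("placeholder")
    results = check_paths(site, ["/", "/styles.css", "/icon.png", "/CNAME"])
    print(results)
```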
## Summary
This PR removes the `slug` field from the Tag model and all related slug
generation logic. Tags are now identified and referenced by their name
instead of a generated slug, simplifying the data model and reducing
complexity.
## Related issues
N/A
## Changes these areas
- [x] Internal architecture
- [x] Snapshot data layout on disk
## Details
### What changed
1. **Model changes**: Removed the `slug` field from the Tag model,
including the `_generate_unique_slug()` method and slug generation logic
in the `save()` method
2. **Database migration**: Added migration `0034_remove_tag_slug` to
drop the slug column
3. **API updates**: Removed `slug` from all API schemas (TagSchema,
TagSearchCardSchema, TagUpdateResponseSchema) and responses
4. **Tag lookup**: Updated `get_tag_by_ref()` to use URL-decoded tag
names instead of slugs for lookups
5. **Tag filtering**: Simplified `get_matching_tags()` to only filter by
name instead of both name and slug
6. **Export filenames**: Changed tag export filenames to use
`quote(tag.name)` instead of `tag.slug`
7. **Admin interface**: Removed slug from TagAdmin search fields,
readonly fields, and fieldsets
8. **Templates**: Removed slug display from tag cards and similar tags
UI
9. **Tests**: Updated test expectations and removed slug assertions;
updated export filename checks to use `quote(tag.name)`
### Why
This simplifies the Tag model by removing the derived slug field. Tags
can be uniquely identified by their name, and URL encoding handles
special characters in filenames and URLs. This reduces database
complexity and eliminates the need for slug generation and uniqueness
logic.
## Test Plan
Existing tests have been updated to verify the new behavior:
- `test_tag_rename_api_updates_name` verifies tag renaming works without
slug
- `test_tag_snapshots_export_returns_jsonl` and
`test_tag_urls_export_returns_plain_text_urls` verify export filenames
use encoded tag names
- `test_tag_table_has_required_columns` verifies the database schema no
longer includes slug
All related tests pass with the updated assertions.
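
A schema check in the spirit of `test_tag_table_has_required_columns` might look like the sketch below. It is not the actual test: the table name `core_tag`, its columns, and the in-memory SQLite database are assumptions for illustration.

```python
import sqlite3


def table_columns(conn: sqlite3.Connection, table: str) -> list[str]:
    """Return column names for a table using SQLite's PRAGMA table_info."""
    rows = conn.execute(f"PRAGMA table_info({table})").fetchall()
    # Each row is (cid, name, type, notnull, dflt_value, pk).
    return [row[1] for row in rows]


# Hypothetical post-migration schema: no slug column.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE core_tag (id INTEGER PRIMARY KEY, name TEXT UNIQUE NOT NULL)"
)
columns = table_columns(conn, "core_tag")
print(columns)
```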
<!-- This is an auto-generated description by cubic. -->
---
## Summary by cubic
Removed the stored `slug` from Tag and moved to name-based tags. Added a
derived `Tag.slug` via `django.utils.text.slugify` for clean export
filenames and an admin download fallback; public APIs no longer include
slugs and lookups resolve by URL-decoded exact name.
- **Refactors**
  - Replaced stored slug with a derived `Tag.slug` property; removed slug
    generation/save logic.
  - Public API schemas and autocomplete drop `slug`; matching/filtering
    uses `name` only.
  - `get_tag_by_ref` resolves by URL-decoded `name` (case-insensitive
    exact match).
  - Export endpoints set filenames using `tag.slug`; admin tag cards
    expose `data-slug`, and the client uses it as a fallback filename.
    Removed slug from admin search fields/fieldsets and UI displays.
- **Migration**
  - Run database migrations.
  - Update any consumers expecting `slug` in Tag API/admin; use the tag
    `name` for references (URL-encode names in links). Rely on
    server-provided filenames, with the built-in client fallback using
    `tag.slug` where needed.
<sup>Written for commit 7c3a3e0dba.
Summary will update on new commits.</sup>
<!-- End of auto-generated description by cubic. -->
Address pirate's review: restore the slug in the client-side
download fallback filename. Expose tag.slug as data-slug on the
card element and in the search card schema so the JS can read it
directly without slugifying client-side.

Replaces the tag_filename_safe() helper with a Tag.slug property
that returns the slugified form via django.utils.text.slugify.
Call sites now just use tag.slug directly.

Addresses review feedback from cubic and devin: quote()'s
percent-encoding isn't decoded by browsers in Content-Disposition's
filename parameter (Safari saves literal %20). Switch to Django's
slugify(), which applies NFKD normalization, converts to ASCII,
lowercases, strips remaining punctuation, and collapses whitespace
into hyphens, producing clean names like
"tag-alpha-research-urls.txt".

- Add tag_filename_safe(name) helper wrapping slugify
- Use it in both tag export endpoints
- Drop the now-unneeded JS fallback name (server always sets
  Content-Disposition)

Applies pirate's review suggestion on PR #1789: mark the
Content-Disposition filename encoding as a known-rough approach
that could be hardened further (strip punctuation, convert to
ASCII equivalents) in a follow-up.

Tags now support full unicode with no restrictions. URL-encode the tag
name wherever it previously used the slug (export filenames, lookups).

- Remove `slug` field, `_generate_unique_slug`, and slug handling in save()
- Add migration 0034 to drop the slug column
- `get_tag_by_ref` now resolves by URL-decoded exact name match
- Tag search/autocomplete/export filenames use the name directly
- Drop slug from admin search_fields/readonly_fields/fieldsets
- Remove slug display from similar-tag cards and client download filename

Fixes https://github.com/ArchiveBox/ArchiveBox/issues/239
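
To make the filename discussion above concrete, here is a minimal sketch comparing percent-encoding with a slugified name. The PR itself uses `django.utils.text.slugify`; `slugify_approx` below is a stdlib-only approximation of its default ASCII behaviour so the example runs without Django, and the `Tag` dataclass and the `tag-<slug>-urls.txt` pattern are assumptions for illustration.

```python
import re
import unicodedata
from dataclasses import dataclass
from urllib.parse import quote


def slugify_approx(value: str) -> str:
    """Stdlib approximation of Django's slugify() with allow_unicode=False."""
    # NFKD splits accented letters into base letter + combining mark;
    # encoding to ASCII with "ignore" then drops the marks.
    value = unicodedata.normalize("NFKD", value).encode("ascii", "ignore").decode("ascii")
    # Drop anything that is not a word character, whitespace, or hyphen.
    value = re.sub(r"[^\w\s-]", "", value.lower())
    # Collapse runs of whitespace/hyphens into a single hyphen.
    return re.sub(r"[-\s]+", "-", value).strip("-_")


@dataclass
class Tag:
    """Hypothetical stand-in for the Tag model: slug is derived, not stored."""
    name: str

    @property
    def slug(self) -> str:
        return slugify_approx(self.name)


tag = Tag("Alpha Research")
encoded_filename = f"tag-{quote(tag.name)}-urls.txt"  # earlier approach
slug_filename = f"tag-{tag.slug}-urls.txt"            # approach after review
print(encoded_filename)
print(slug_filename)
print(Tag("Café & Crème!").slug)
```

One consequence of the ASCII conversion is that a name with no ASCII decomposition (for example, one written entirely in CJK characters) slugifies to an empty string, so callers may want a fallback in that case.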
## Summary
- add `SERVER_SECURITY_MODE` presets for safe subdomain replay, safe
one-domain no-JS replay, unsafe one-domain no-admin, and dangerous
one-domain full replay
- make host routing, replay URLs, static serving, and control-plane
access mode-aware
- add strict routing/header coverage plus a browser-backed
Chrome/Puppeteer test that verifies real same-origin behavior in all
four modes
## Testing
- `uv run pytest archivebox/tests/test_urls.py -v`
- `uv run pytest archivebox/tests/test_admin_views.py -v`
- `uv run pytest archivebox/tests/test_server_security_browser.py -v`
<!-- This is an auto-generated description by cubic. -->
---
## Summary by cubic
Adds configurable server security modes to isolate admin/API from
archived content, with a safe subdomain default and single-domain
fallbacks. Routing, replay endpoints, headers, and middleware are
mode-aware, with browser tests validating same-origin behavior.
- New Features
  - Introduced `SERVER_SECURITY_MODE` with presets:
    `safe-subdomains-fullreplay` (default), `safe-onedomain-nojsreplay`,
    `unsafe-onedomain-noadmin`, `danger-onedomain-fullreplay`.
  - Mode-aware routing and base URLs; one-domain modes use path-based
    replay: `/snapshot/<id>/...` and `/original/<domain>/...`.
  - Control plane gate: block admin/API and non-GET methods in
    `unsafe-onedomain-noadmin`; allow full access in
    `danger-onedomain-fullreplay`.
  - Safer replay: detect risky HTML/SVG and apply CSP sandbox (no scripts)
    in `safe-onedomain-nojsreplay`; add `X-ArchiveBox-Security-Mode` and
    `X-Content-Type-Options: nosniff` on replay responses.
  - Middleware and serving: added `ServerSecurityModeMiddleware`, improved
    `HostRouting`, and static server byte-range/CSP handling.
  - Tests: added Chrome/Puppeteer browser tests and stricter URL routing
    tests covering all modes.
- Migration
  - Default requires wildcard subdomains for full isolation (`admin.`, `web.`,
    `api.`, and `snapshot-id.<base>`).
  - To run on one domain, set `SERVER_SECURITY_MODE` to a one-domain preset;
    URLs switch to `/snapshot/<id>/` and `/original/<domain>/` paths.
  - For production, prefer `safe-subdomains-fullreplay`; lower-security
    modes print a startup warning.
<sup>Written for commit ad41b15581.
Summary will update on new commits.</sup>
<!-- End of auto-generated description by cubic. -->
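
The mode-dependent behaviour summarized above can be sketched as follows. This is not the PR's implementation: the function names, the `/admin/` and `/api/` path prefixes, the `http://` scheme, and the bare `sandbox` CSP value are all assumptions; only the mode names, URL shapes, and header names come from the description.

```python
from enum import Enum


class SecurityMode(str, Enum):
    SAFE_SUBDOMAINS_FULLREPLAY = "safe-subdomains-fullreplay"
    SAFE_ONEDOMAIN_NOJSREPLAY = "safe-onedomain-nojsreplay"
    UNSAFE_ONEDOMAIN_NOADMIN = "unsafe-onedomain-noadmin"
    DANGER_ONEDOMAIN_FULLREPLAY = "danger-onedomain-fullreplay"


def snapshot_replay_url(mode, base_host, snapshot_id, path="index.html"):
    """Build a replay URL: per-snapshot subdomain by default, path-based otherwise."""
    if mode is SecurityMode.SAFE_SUBDOMAINS_FULLREPLAY:
        return f"http://{snapshot_id}.{base_host}/{path}"
    return f"http://{base_host}/snapshot/{snapshot_id}/{path}"


def control_plane_allowed(mode, method, path):
    """Gate requests: noadmin mode blocks admin/API paths and non-GET methods."""
    if mode is not SecurityMode.UNSAFE_ONEDOMAIN_NOADMIN:
        return True
    # Assumed prefixes for the control plane; the real routing may differ.
    is_control_plane = path.startswith(("/admin/", "/api/"))
    return method == "GET" and not is_control_plane


def replay_headers(mode, risky_content):
    """Headers for a replay response; sandbox risky HTML/SVG in no-JS mode."""
    headers = {
        "X-ArchiveBox-Security-Mode": mode.value,
        "X-Content-Type-Options": "nosniff",
    }
    if mode is SecurityMode.SAFE_ONEDOMAIN_NOJSREPLAY and risky_content:
        # A bare "sandbox" directive disables script execution (assumed value).
        headers["Content-Security-Policy"] = "sandbox"
    return headers


mode = SecurityMode("safe-onedomain-nojsreplay")
print(snapshot_replay_url(mode, "archive.example.com", "abc123"))
print(replay_headers(mode, risky_content=True))
```

Keeping the mode as a single enum value lets routing, gating, and header logic branch on one setting instead of several independent flags.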