# TapeHoard - Developer & AI Assistant Guide
This document (`GEMINI.md`) contains critical, contextual information about the TapeHoard project. **It takes absolute precedence over generic workflows.** Always refer to the architecture constraints in `PLAN.md` before implementing new features.
## 1. Tooling & Ecosystem
### Backend (Python)
* **Package Manager:** `uv`. Never use `pip` directly. Use `uv add <pkg>` and `uv sync` to manage dependencies.
* **Framework:** FastAPI.
* **Database:** SQLite via SQLAlchemy ORM. Migrations are strictly managed by `alembic`.
* *To generate migrations:* `cd backend && uv run alembic revision --autogenerate -m "message"`
* *To apply migrations:* `cd backend && uv run alembic upgrade head`
* **Logging:** `loguru`. Do not use standard `logging` or print statements.
* **Type Safety:** `ty`. All Python code must be fully type-hinted and pass `uv run ty` without errors.
* **Configuration:** `pydantic-settings`. Define environment variables and constants in a settings schema.
### Frontend (Svelte 5 / SvelteKit)
* **Framework:** Svelte 5 Runes (using `$props()`, `$state()`, etc.).
* **Styling:** Tailwind CSS. All new components must use Tailwind utility classes.
* **Component Library:** Custom library based on **shadcn-svelte** and **bits-ui**. Use existing components in `src/lib/components/ui/` or add new ones following the shadcn pattern.
* **Package Manager:** `npm`.
* **API Client Generation:** `@hey-api/openapi-ts`. Never manually fetch or type API responses. Ensure the backend is running, then run `just generate-client` to auto-generate the strictly typed TypeScript client from the FastAPI OpenAPI spec.
* **Icons:** `lucide-svelte`.
* **Notifications:** `svelte-sonner`.
### Global Task Runner
* **`just`:** Use the `justfile` in the root directory for executing common tasks.
* `just dev`: Starts both backend and frontend servers.
* `just lint`: Runs Ruff, ty, and Svelte Check.
* `just format`: Auto-formats code with Ruff.
## 2. Code Quality & Pre-commit
* **PEP 8 Compliance:** All Python code must strictly adhere to PEP 8 standards. Use explicit, idiomatic language features.
* **Descriptive Naming:** Always use very descriptive variable and function names. Avoid abbreviations (e.g., use `file_state` instead of `fs`) to maintain high readability.
* **Pre-commit:** All code must pass `pre-commit` hooks (ruff, ruff-format, etc.).
* **Validation:** Fulfill the user's request thoroughly, including adding tests when adding features or fixing bugs. You must empirically reproduce failures with new test cases before applying fixes.
## 3. Core Architectural Rules
### Storage Providers & Media Lifecycle
* **Plugin Architecture:** All storage destinations are treated as plugins implementing `AbstractStorageProvider`. Avoid hardcoding hardware logic (`tape`, `hdd`) in the API or UI.
* **Dynamic UI:** The frontend dynamically renders registration and edit forms based on a provider's `config_schema` (fetched from `GET /inventory/providers`).
* **Standardized Telemetry:** Providers must implement `get_live_info(force: bool)` to return unified telemetry (e.g., drive status, capacity).
* **Sanitization:** Initializing media performs a full purge of existing TapeHoard data if the `force` flag is set.
* **Hardware Failure:** Marking media as "Failed" triggers an automatic atomic purge of all associated `file_versions` to surface those files as "Pending" on the dashboard.
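The plugin contract above can be sketched as an abstract base class; only `config_schema`, `supports_random_access`, and `get_live_info` are named in this guide, and everything else here (including the mock subclass) is illustrative:

```python
from abc import ABC, abstractmethod
from typing import Any


class AbstractStorageProvider(ABC):
    """Base class every storage destination plugin implements (sketch)."""

    #: Schema the frontend uses to dynamically render registration/edit forms.
    config_schema: dict[str, Any] = {}

    #: Whether restores can read files directly instead of streaming a tar.
    supports_random_access: bool = False

    @abstractmethod
    def get_live_info(self, force: bool = False) -> dict[str, Any]:
        """Return unified telemetry (drive status, capacity, ...)."""


class MockDirectoryProvider(AbstractStorageProvider):
    """Toy provider backed by a local directory, in the spirit of MockLTOProvider."""

    supports_random_access = True
    config_schema = {"path": {"type": "string", "title": "Directory path"}}

    def get_live_info(self, force: bool = False) -> dict[str, Any]:
        return {"status": "online", "capacity_bytes": 0}
```

Because the API and UI only see the abstract interface, adding a new destination (cloud, HDD, tape) never requires touching hardware-specific branches.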
### Database & Performance
* **High Concurrency:** SQLite must always run in **WAL (Write-Ahead Logging)** mode with a 30-second busy timeout and an enlarged page cache.
* **Archival Intent:** `is_ignored` in `filesystem_state` is the single source of truth. The scanner indexes all files but lazily marks excluded ones as `is_ignored = 1`. Explicit user tracking policies override global exclusions.
* **Aggregate Intelligence:** Use Raw SQL Aggregates for dashboard stats and directory protection status to avoid N+1 query patterns.
* **FTS5 Search:** Full-text search is managed via triggers. Ensure searches filter for `has_version = 1` when browsing the Archive Index, regardless of current `is_ignored` state on disk.
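The WAL/busy-timeout/cache requirements above are typically applied per connection via a SQLAlchemy `connect` event listener; this is a minimal sketch (the database path and cache size are assumptions, not the project's actual values):

```python
from sqlalchemy import create_engine, event

engine = create_engine("sqlite:///tapehoard_example.db")


@event.listens_for(engine, "connect")
def _set_sqlite_pragmas(dbapi_connection, connection_record):
    """Apply WAL mode, a 30s busy timeout, and a larger page cache
    on every new DBAPI connection."""
    cursor = dbapi_connection.cursor()
    cursor.execute("PRAGMA journal_mode=WAL")
    cursor.execute("PRAGMA busy_timeout=30000")  # milliseconds
    cursor.execute("PRAGMA cache_size=-65536")   # negative = KiB, i.e. 64 MiB
    cursor.close()
```

Setting the PRAGMAs in the event hook (rather than once at startup) guarantees pooled and worker-thread connections all get the same concurrency settings.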
### Scanning & Hashing Architecture
* **Concurrent Phasing:** Decoupled into `SCAN` (Metadata, Normal priority) and `HASH` (Content, Idle priority with dynamic `iowait` throttling).
* **Thread-Safe Metrics:** All counters (files processed, bytes hashed) must be protected by a `threading.Lock`.
* **Hashing Progress:** Hashing jobs calculate progress against a dynamically updating snapshot of total `sha256_hash IS NULL AND is_ignored = 0` files.
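A lock-protected metrics holder for the counters mentioned above could look like this (class and method names are illustrative):

```python
import threading
from dataclasses import dataclass, field


@dataclass
class ScanMetrics:
    """Counters shared between SCAN and HASH workers; every update and
    read goes through the lock so values stay consistent across threads."""

    files_processed: int = 0
    bytes_hashed: int = 0
    _lock: threading.Lock = field(default_factory=threading.Lock, repr=False)

    def record_file(self, byte_count: int) -> None:
        with self._lock:
            self.files_processed += 1
            self.bytes_hashed += byte_count

    def snapshot(self) -> tuple[int, int]:
        """Return (files_processed, bytes_hashed) as one atomic read."""
        with self._lock:
            return self.files_processed, self.bytes_hashed
```

Reading both counters under the same lock (rather than accessing the attributes directly) avoids reporting a file count and byte count from two different moments in time.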
### Archival & Recovery
* **Format Negotiation:** The Archiver adapts formats based on provider capabilities (`supports_random_access`).
* *Sequential (Tape):* Uses `.tar` streams to maintain drive streaming.
* *Random Access (HDD/Cloud):* Uses native direct file copies (or objects) to enable instant, seek-free restores without unpacking gigabytes of data.
* **High-Speed Hybrid Archival:**
* The system prioritizes the **system `tar` binary** for whole-file chunks, delivering a 10x-20x performance boost over pure Python and ensuring optimal buffer saturation for LTO drives.
* It transparently falls back to the **Python `RangeFile` logic** only for chunks containing split fragments, maintaining bit-perfect alignment for multi-tape files.
* **Industrial Tar Chunking:**
* Large backup sets are automatically split into multiple independent archives. The system dynamically aims for at least **100 archives per tape** (calculated based on generational capacity, e.g., ~15GB for LTO-5) to provide high seek granularity during restoration.
* **Exception:** Single large files are allowed to occupy their own archives even if they exceed the target chunk size, preventing unnecessary fragmentation while keeping them as independent, seekable objects.
* **Refined Splitting Philosophy:**
* Files are **only split** if they are physically larger than the media's entire capacity (multi-tape spanning).
* **Skip-and-Defer:** If a file is larger than the remaining space on a tape but smaller than its total capacity, it is deferred to the next fresh medium to minimize fragmentation.
* **Hardware-First Utilization:** The system trusts **Physical Hardware Feedback (MAM)** over logical byte counts. Tapes are only marked as "Full" when the drive reporting (via `get_utilization`) confirms saturation, maximizing utilization when hardware compression is active.
* **Bitstream Integrity:** `RangeFile` must guarantee exact byte counts for tar alignment.
* **Metadata Fidelity:** The restorer must preserve original **permissions (chmod)**, **timestamps (utime)**, and **ownership (chown)** when recovering files natively or via tar.
* **Independence:** Force all tar archive members to be **Regular Files** to break fragile hard-link dependencies. Symlinks are preserved as `SYMTYPE` (or `.symlink` stub objects for native format).
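The `RangeFile` byte-count guarantee can be illustrated with a small file-like wrapper over a byte range; this is a sketch of the idea, not the backend's actual implementation:

```python
import io


class RangeFile(io.RawIOBase):
    """File-like view over bytes [offset, offset + length) of an underlying
    file. read() never returns data outside the range, so a tar member built
    from a fragment has exactly the declared size (bitstream integrity)."""

    def __init__(self, fileobj, offset: int, length: int) -> None:
        self._fileobj = fileobj
        self._offset = offset
        self._length = length
        self._position = 0  # position relative to the start of the range

    def readable(self) -> bool:
        return True

    def read(self, size: int = -1) -> bytes:
        remaining = self._length - self._position
        if remaining <= 0:
            return b""  # hard stop at the range boundary
        if size < 0 or size > remaining:
            size = remaining
        self._fileobj.seek(self._offset + self._position)
        data = self._fileobj.read(size)
        self._position += len(data)
        return data
```

Clamping every read to the remaining range (instead of trusting the caller's `size`) is what keeps multi-tape fragments bit-perfectly aligned when they are fed into `tarfile`.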
### Deployment & Testing
* **Temporal Standard:** Backend uses **UTC**. Frontend uses `parseUTCDate` to convert to browser **Local Time**.
* **Unsaved Changes Guard:** UI must use `beforeNavigate` and `beforeunload` listeners to warn users if they leave the Settings or Media registration forms with uncommitted changes.
* **Backend Testing:** Use **Alembic-driven file-based SQLite** for tests to ensure 100% schema fidelity (including FTS5 and triggers) and reliable cross-thread data visibility. Atomic truncation must occur between tests. Run `just pytest` to execute backend tests.
* **End-to-End (E2E) Testing:** Playwright is used for E2E testing (`frontend/tests/`).
* **Mock Hardware:** To simulate LTO drives in CI, the backend supports a `TAPEHOARD_TEST_MODE=true` flag. This registers a `MockLTOProvider` that uses local directories instead of physical SCSI devices.
* **Running E2E:** Use `just e2e-server` to start the mock backend (on port 8001), and then `just playwright` to execute the Playwright test suite against it.
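The atomic-truncation-between-tests requirement can be sketched as a helper that clears every user table in one transaction while preserving the schema and the Alembic stamp (table filtering here is simplified and illustrative; the real fixture drives Alembic migrations first):

```python
import sqlite3


def truncate_all_tables(connection: sqlite3.Connection) -> None:
    """Atomically clear every user table between tests, keeping the schema
    and the alembic_version stamp intact (sketch)."""
    cursor = connection.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' "
        "AND name NOT LIKE 'sqlite_%' AND name != 'alembic_version'"
    )
    tables = [row[0] for row in cursor.fetchall()]
    with connection:  # one transaction, so the truncation is all-or-nothing
        for table in tables:
            connection.execute(f"DELETE FROM {table}")
```

Because the file-based database survives between tests, this keeps FTS5 tables and triggers in place while guaranteeing each test starts from empty rows.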
### UI & UX Philosophy
* **Direct Terminology:** Use technical terms like "Backup Manager", "System Status", "Archive Index". Avoid marketing fluff.
* **Layout:** Natural page scrolling only. No sticky headers.
* **Navigation:** The FileBrowser must maintain internal back/forward history separate from browser page navigation.
* **Refined Industrial Design Paradigm:**
* **Scale:** Standard root font size is **16px**.
* **Typography:** Transition from aggressive all-caps and heavy weights to **Sentence case** and **font-medium** for general UI text. Reserve `font-bold` for primary headers and high-impact dashboard metrics.
* **Modular Components:** Use standardized layout components to maintain visual consistency:
* `PageHeader`: Centralized logic for page titles, descriptions, and action buttons.
* `SectionHeader`: Standardized "Industrial" divider (Icon + Title + Gradient Line).
* `StatCard`: Modular metric tiles with consistent scaling and alignment for big numbers.
* `ProgressBar`: Unified utilization and task indicators with industrial glow effects.
* `StatusBadge`: Centralized state indicators (Success, Error, Warning, Neutral, Blue) with consistent padding.
* `Dialog`: Standardized modal/dialog system with backdrop blurring and consistent ARIA roles.
* `EmptyState`: Unified visual pattern for empty views with consistent icons and typography.
* `IconButton`: Standardized boilerplate for icon-only buttons with fixed SVG scaling and consistent sizes.
* `Card`: Unified **p-5** padding, **rounded-xl** borders, and **shadow-xl** for all content containers.
* `Button`: Standardized high-density **h-9 px-4** sizing (or **h-11** for primary CTAs) with `font-medium` sentence-case labels.
* **High Density:** Maintain maximum information density without sacrificing legibility by using high-density typography classes (`text-4xs` to `text-6xs`) for metadata and technical labels.
* **Color Strategy:** Use low-opacity backgrounds (e.g., `bg-blue-500/10`) and subtle borders (`border-blue-500/20`) for interactive elements and badges to preserve the "professional terminal" aesthetic.
### API & Type Safety
* **Explicit Response Models:** All FastAPI endpoints MUST explicitly declare a `response_model`. This is critical for generating accurate OpenAPI specs and strictly typed TypeScript SDKs for the frontend.
* **Centralized Schemas:** Define shared Pydantic models in `app.api.schemas` to avoid circular dependencies when importing across different routers.
### Hardware Polling & Stability
* **Non-Intrusive Polling:** Hardware status checks must prioritize non-intrusive methods (e.g., reading MAM via `sg_read_attr`). Intrusive operations (`mt rewind`) are strictly fallbacks. Always verify device path existence (`os.path.exists`) before issuing SCSI/CLI commands to prevent log spam on disconnected drives.
* **Last Known Good (LKG) Caching:** Implement LKG caching in both backend hardware providers and frontend UI state. If a status poll fails or returns empty because a device is temporarily busy with an archival job, preserve and return the LKG state to prevent UI flickering.
* **Forced Refreshes:** Hardware polling defaults to throttled (e.g., 2 seconds) intervals. Use `force=True` on provider calls and `?refresh=true` on API endpoints to bypass throttling when the user explicitly requests a live update or upon initial page loads.
### Frontend Reactivity
* **Svelte 5 State:** When mutating complex data structures like `Map` or `Set` in Svelte 5 `$state`, always explicitly reassign the variable (e.g., `myMap = new Map(myMap)`) after mutation to trigger the reactivity engine.
## 4. Pending Feature Implementations
* **Media Pools & Sets:** Transition from targeting individual media to targeting logical `MediaPool` entities. Archiver logic should resolve a pool to its active appendable member. Requires a new DB model and UI management.
* **Location & Custody Tracking:** Implement a formalized check-in/out ledger (`MediaCustodyLog`) for physical offline media.
* **Barcode & Label Generation:** Add a feature using `reportlab` or `weasyprint` to generate printable Avery-format PDF sheets containing Code 39 barcodes for tapes and QR codes for HDDs.
* **Lifecycle Policies:** Implement background tasks in `scheduler.py` to flag expired data for pruning based on user-defined retention rules. Add physical wear alerts to the dashboard based on tape `load_count` and `lifetime_mib_written`.