TapeHoard - Developer & AI Assistant Guide
This document (GEMINI.md) contains critical, contextual information about the TapeHoard project. It takes absolute precedence over generic workflows. Always refer to the architecture constraints in PLAN.md before implementing new features.
1. Tooling & Ecosystem
Backend (Python)
- Package Manager: `uv`. Never use `pip` directly. Use `uv add <pkg>` and `uv sync` to manage dependencies.
- Framework: FastAPI.
- Database: SQLite via SQLAlchemy ORM. Migrations are strictly managed by `alembic`.
  - To generate migrations: `cd backend && uv run alembic revision --autogenerate -m "message"`
  - To apply migrations: `cd backend && uv run alembic upgrade head`
- Logging: `loguru`. Do not use standard `logging` or print statements.
- Type Safety: `ty`. All Python code must be fully type-hinted and pass `uv run ty` without errors.
- Configuration: `pydantic-settings`. Define environment variables and constants in a settings schema.
Frontend (Svelte 5 / SvelteKit)
- Framework: Svelte 5 Runes (using `$props()`, `$state()`, etc.).
- Styling: Tailwind CSS. All new components must use Tailwind utility classes.
- Component Library: Custom library based on shadcn-svelte and bits-ui. Use existing components in `src/lib/components/ui/` or add new ones following the shadcn pattern.
- Package Manager: `npm`.
- API Client Generation: `@hey-api/openapi-ts`. Never manually fetch or type API responses. Ensure the backend is running, then run `just generate-client` to auto-generate the strictly typed TypeScript client from the FastAPI OpenAPI spec.
- Icons: `lucide-svelte`.
- Notifications: `svelte-sonner`.
Global Task Runner
- `just`: Use the `justfile` in the root directory for executing common tasks.
  - `just dev`: Starts both backend and frontend servers.
  - `just lint`: Runs Ruff, ty, and Svelte Check.
  - `just format`: Auto-formats code with Ruff.
2. Code Quality & Pre-commit
- PEP 8 Compliance: All Python code must strictly adhere to PEP 8 standards. Use explicit, idiomatic language features.
- Descriptive Naming: Always use very descriptive variable and function names. Avoid abbreviations (e.g., use `file_state` instead of `fs`) to maintain high readability.
- Pre-commit: All code must pass `pre-commit` hooks (ruff, ruff-format, etc.).
- Validation: Fulfill the user's request thoroughly, including adding tests when adding features or fixing bugs. You must empirically reproduce failures with new test cases before applying fixes.
3. Core Architectural Rules
Storage Providers & Media Lifecycle
- Plugin Architecture: All storage destinations are treated as plugins implementing `AbstractStorageProvider`. Avoid hardcoding hardware logic (`tape`, `hdd`) in the API or UI.
- Dynamic UI: The frontend dynamically renders registration and edit forms based on a provider's `config_schema` (fetched from `GET /inventory/providers`).
- Standardized Telemetry: Providers must implement `get_live_info(force: bool)` to return unified telemetry (e.g., drive status, capacity).
- Sanitization: Initializing media performs a full purge of existing TapeHoard data if the `force` flag is set.
- Hardware Failure: Marking media as "Failed" triggers an automatic atomic purge of all associated `file_versions` to surface those files as "Pending" on the dashboard.
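
The plugin contract above could be sketched roughly as follows. This is a minimal illustration, not the project's actual class: only `AbstractStorageProvider`, `config_schema`, `get_live_info(force: bool)`, and `supports_random_access` are named in this guide. The `LiveInfo` fields, the schema layout, and `DirectoryProvider` are hypothetical.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class LiveInfo:
    """Unified telemetry shape (field names are assumptions for illustration)."""

    drive_status: str
    capacity_bytes: int
    used_bytes: int


class AbstractStorageProvider(ABC):
    """Contract that every storage plugin implements."""

    # Capability flag so callers never branch on a hardware type name.
    supports_random_access: bool = False

    @classmethod
    @abstractmethod
    def config_schema(cls) -> dict[str, Any]:
        """Describe the form fields the frontend should render."""

    @abstractmethod
    def get_live_info(self, force: bool = False) -> LiveInfo:
        """Return current telemetry; force=True asks for a fresh reading."""


class DirectoryProvider(AbstractStorageProvider):
    """Hypothetical provider backed by a local directory."""

    supports_random_access = True

    def __init__(self, root_path: str, capacity_bytes: int) -> None:
        self.root_path = root_path
        self.capacity_bytes = capacity_bytes

    @classmethod
    def config_schema(cls) -> dict[str, Any]:
        # Assumed schema layout; the real format is defined by the backend.
        return {
            "root_path": {"type": "string", "title": "Root path"},
            "capacity_bytes": {"type": "integer", "title": "Capacity (bytes)"},
        }

    def get_live_info(self, force: bool = False) -> LiveInfo:
        return LiveInfo(
            drive_status="ready",
            capacity_bytes=self.capacity_bytes,
            used_bytes=0,
        )
```

Because `AbstractStorageProvider` cannot be instantiated directly, a plugin that forgets either method fails at construction time rather than at the first poll.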
Database & Performance
- High Concurrency: SQLite must always run in WAL (Write-Ahead Logging) mode with a 30s busy timeout and larger page cache.
- Archival Intent: `is_ignored` in `filesystem_state` is the single source of truth. The scanner indexes all files but lazily marks excluded ones as `is_ignored = 1`. Explicit user tracking policies override global exclusions.
- Aggregate Intelligence: Use Raw SQL Aggregates for dashboard stats and directory protection status to avoid N+1 query patterns.
- FTS5 Search: Full-text search is managed via triggers. Ensure searches filter for `has_version = 1` when browsing the Archive Index, regardless of current `is_ignored` state on disk.
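
A minimal sketch of the connection setup and a single-pass aggregate, using the standard-library `sqlite3` module rather than the project's SQLAlchemy layer. The 30s busy timeout comes from this guide; the cache size is an assumed value (the guide only says "larger"), and placing `sha256_hash` in `filesystem_state` is likewise an assumption.

```python
import sqlite3

BUSY_TIMEOUT_MILLISECONDS = 30_000
PAGE_CACHE_KIBIBYTES = 64 * 1024  # assumed value, not taken from the project


def open_connection(database_path: str) -> sqlite3.Connection:
    """Open a SQLite connection with WAL, a busy timeout, and a larger cache."""
    connection = sqlite3.connect(database_path)
    connection.execute("PRAGMA journal_mode=WAL")
    connection.execute(f"PRAGMA busy_timeout={BUSY_TIMEOUT_MILLISECONDS}")
    # A negative cache_size is interpreted as KiB instead of a page count.
    connection.execute(f"PRAGMA cache_size=-{PAGE_CACHE_KIBIBYTES}")
    return connection


def summarize_filesystem_state(connection: sqlite3.Connection) -> dict[str, int]:
    """Compute dashboard counters in one query instead of one query per file."""
    row = connection.execute(
        """
        SELECT
            COUNT(*),
            COALESCE(SUM(is_ignored = 0), 0),
            COALESCE(SUM(is_ignored = 0 AND sha256_hash IS NULL), 0)
        FROM filesystem_state
        """
    ).fetchone()
    return {
        "total_files": row[0],
        "tracked_files": row[1],
        "pending_hash_files": row[2],
    }
```

The `COALESCE` wrappers matter on an empty table, where `SUM` returns `NULL` rather than zero.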
Scanning & Hashing Architecture
- Concurrent Phasing: Decoupled into `SCAN` (Metadata, Normal priority) and `HASH` (Content, Idle priority with dynamic `iowait` throttling).
- Thread-Safe Metrics: All counters (files processed, bytes hashed) must be protected by a `threading.Lock`.
- Hashing Progress: Hashing jobs calculate progress against a dynamically updating snapshot of total `sha256_hash IS NULL AND is_ignored = 0` files.
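
The thread-safe metrics rule might be implemented along these lines. `ScanMetrics`, `MetricsSnapshot`, and their method names are hypothetical; the only requirement taken from this guide is that every counter update happens under a `threading.Lock`.

```python
import threading
from dataclasses import dataclass


@dataclass(frozen=True)
class MetricsSnapshot:
    """Immutable copy of the counters at one point in time."""

    files_processed: int
    bytes_hashed: int


class ScanMetrics:
    """Counters shared between worker threads (hypothetical helper)."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._files_processed = 0
        self._bytes_hashed = 0

    def record_hashed_file(self, size_in_bytes: int) -> None:
        # Both counters change under one lock so readers never observe
        # a file count that disagrees with the byte count.
        with self._lock:
            self._files_processed += 1
            self._bytes_hashed += size_in_bytes

    def snapshot(self) -> MetricsSnapshot:
        with self._lock:
            return MetricsSnapshot(self._files_processed, self._bytes_hashed)
```

Returning a frozen snapshot instead of exposing the attributes keeps progress reporting consistent even while workers keep writing.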
Archival & Recovery
- Format Negotiation: The Archiver adapts formats based on provider capabilities (`supports_random_access`).
  - Sequential (Tape): Uses `.tar` streams to maintain drive streaming.
  - Random Access (HDD/Cloud): Uses native direct file copying/objects to enable instant seekless restores without unpacking gigabytes of data.
- High-Speed Hybrid Archival:
  - The system prioritizes the system `tar` binary for whole-file chunks, delivering a 10x-20x performance boost over pure Python and ensuring optimal buffer saturation for LTO drives.
  - It transparently falls back to the Python `RangeFile` logic only for chunks containing split fragments, maintaining bit-perfect alignment for multi-tape files.
- Industrial Tar Chunking:
  - Large backup sets are automatically split into multiple independent archives. The system dynamically aims for at least 100 archives per tape (calculated based on generational capacity, e.g., ~15GB for LTO-5) to provide high seek granularity during restoration.
  - Exception: Single large files are allowed to occupy their own archives even if they exceed the target chunk size, preventing unnecessary fragmentation while keeping them as independent, seekable objects.
- Refined Splitting Philosophy:
  - Files are only split if they are physically larger than the media's entire capacity (multi-tape spanning).
  - Skip-and-Defer: If a file is larger than the remaining space on a tape but smaller than its total capacity, it is deferred to the next fresh medium to minimize fragmentation.
- Hardware-First Utilization: The system trusts Physical Hardware Feedback (MAM) over logical byte counts. Tapes are only marked as "Full" when the drive reporting (via `get_utilization`) confirms saturation, maximizing utilization when hardware compression is active.
- Bitstream Integrity: `RangeFile` must guarantee exact byte counts for tar alignment.
- Metadata Fidelity: The restorer must preserve original permissions (chmod), timestamps (utime), and ownership (chown) when recovering files natively or via tar.
- Independence: Force all tar archive members to be Regular Files to break fragile hard-link dependencies. Symlinks are preserved as `SYMTYPE` (or `.symlink` stub objects for native format).
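
The chunk-size arithmetic and the split/defer rules above can be sketched as below. All function and enum names are hypothetical. The 100-archive floor comes from this guide, and LTO-5's 1.5 TB native capacity gives the ~15GB example. In line with the hardware-first rule, `remaining_bytes` is assumed to be derived from drive feedback rather than from a logical byte count.

```python
from enum import Enum

MINIMUM_ARCHIVES_PER_MEDIUM = 100


def target_chunk_size_bytes(media_capacity_bytes: int) -> int:
    """Largest chunk size that still yields at least 100 archives per medium."""
    return media_capacity_bytes // MINIMUM_ARCHIVES_PER_MEDIUM


class Placement(Enum):
    WRITE_HERE = "write_here"
    DEFER_TO_NEXT_MEDIUM = "defer_to_next_medium"
    SPLIT_ACROSS_MEDIA = "split_across_media"


def decide_placement(
    file_size_bytes: int,
    media_capacity_bytes: int,
    remaining_bytes: int,
) -> Placement:
    """Apply the splitting rules to a single file."""
    # Only files that can never fit on one medium are split.
    if file_size_bytes > media_capacity_bytes:
        return Placement.SPLIT_ACROSS_MEDIA
    # Skip-and-defer: the file fits on a fresh medium but not on this one.
    if file_size_bytes > remaining_bytes:
        return Placement.DEFER_TO_NEXT_MEDIUM
    return Placement.WRITE_HERE


LTO5_NATIVE_CAPACITY_BYTES = 1_500_000_000_000
print(target_chunk_size_bytes(LTO5_NATIVE_CAPACITY_BYTES))  # 15000000000
```

A single file larger than the target chunk size still gets `WRITE_HERE` or `DEFER_TO_NEXT_MEDIUM`, which matches the exception that large files occupy their own archive instead of being fragmented.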
Deployment & Testing
- Temporal Standard: Backend uses UTC. Frontend uses `parseUTCDate` to convert to browser Local Time.
- Unsaved Changes Guard: UI must use `beforeNavigate` and `beforeunload` listeners to warn users if they leave the Settings or Media registration forms with uncommitted changes.
- Backend Testing: Use Alembic-driven file-based SQLite for tests to ensure 100% schema fidelity (including FTS5 and triggers) and reliable cross-thread data visibility. Atomic truncation must occur between tests. Run `just pytest` to execute backend tests.
- End-to-End (E2E) Testing: Playwright is used for E2E testing (`frontend/tests/`).
  - Mock Hardware: To simulate LTO drives in CI, the backend supports a `TAPEHOARD_TEST_MODE=true` flag. This registers a `MockLTOProvider` that uses local directories instead of physical SCSI devices.
  - Running E2E: Use `just e2e-server` to start the mock backend (on port 8001), and then `just playwright` to execute the Playwright test suite against it.
UI & UX Philosophy
- Direct Terminology: Use technical terms like "Backup Manager", "System Status", "Archive Index". Avoid marketing fluff.
- Layout: Natural page scrolling only. No sticky headers.
- Navigation: The FileBrowser must maintain internal back/forward history separate from browser page navigation.
- Refined Industrial Design Paradigm:
  - Scale: Standard root font size is 16px.
  - Typography: Transition from aggressive all-caps and heavy weights to Sentence case and font-medium for general UI text. Reserve `font-bold` for primary headers and high-impact dashboard metrics.
  - Modular Components: Use standardized layout components to maintain visual consistency:
    - `PageHeader`: Centralized logic for page titles, descriptions, and action buttons.
    - `SectionHeader`: Standardized "Industrial" divider (Icon + Title + Gradient Line).
    - `StatCard`: Modular metric tiles with consistent scaling and alignment for big numbers.
    - `ProgressBar`: Unified utilization and task indicators with industrial glow effects.
    - `StatusBadge`: Centralized state indicators (Success, Error, Warning, Neutral, Blue) with consistent padding.
    - `Dialog`: Standardized modal/dialog system with backdrop blurring and consistent ARIA roles.
    - `EmptyState`: Unified visual pattern for empty views with consistent icons and typography.
    - `IconButton`: Standardized boilerplate for icon-only buttons with fixed SVG scaling and consistent sizes.
    - `Card`: Unified p-5 padding, rounded-xl borders, and shadow-xl for all content containers.
    - `Button`: Standardized high-density h-9 px-4 sizing (or h-11 for primary CTAs) with `font-medium` sentence-case labels.
  - High Density: Maintain maximum information density without sacrificing legibility by utilizing high-density typography classes (`text-4xs` to `text-6xs`) for metadata and technical labels.
  - Color Strategy: Use low-opacity backgrounds (e.g., `bg-blue-500/10`) and subtle borders (border-blue-500/20) for interactive elements and badges to preserve the "professional terminal" aesthetic.
API & Type Safety
- Explicit Response Models: All FastAPI endpoints MUST explicitly declare a `response_model`. This is critical for generating accurate OpenAPI specs and strictly typed TypeScript SDKs for the frontend.
- Centralized Schemas: Define shared Pydantic models in `app.api.schemas` to avoid circular dependencies when importing across different routers.
Hardware Polling & Stability
- Non-Intrusive Polling: Hardware status checks must prioritize non-intrusive methods (e.g., reading MAM via `sg_read_attr`). Intrusive operations (mt rewind) are strictly fallbacks. Always verify device path existence (os.path.exists) before issuing SCSI/CLI commands to prevent log spam on disconnected drives.
- Last Known Good (LKG) Caching: Implement LKG caching in both backend hardware providers and frontend UI state. If a status poll fails or returns empty because a device is temporarily busy with an archival job, preserve and return the LKG state to prevent UI flickering.
- Forced Refreshes: Hardware polling defaults to throttled (e.g., 2 seconds) intervals. Use `force=True` on provider calls and `?refresh=true` on API endpoints to bypass throttling when the user explicitly requests a live update or upon initial page loads.
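
The three polling rules combine naturally into one small wrapper. The sketch below is illustrative only: `ThrottledStatusPoller`, its constructor arguments, and the dictionary-shaped status are hypothetical. Returning `None` for a missing device path is also an assumption, since a real provider may prefer to report a disconnected state.

```python
import os
import time
from typing import Any, Callable, Optional

StatusReader = Callable[[], Optional[dict[str, Any]]]


class ThrottledStatusPoller:
    """Throttled hardware poll with Last Known Good fallback (hypothetical)."""

    def __init__(
        self,
        device_path: str,
        read_status: StatusReader,
        minimum_interval_seconds: float = 2.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._device_path = device_path
        self._read_status = read_status
        self._minimum_interval_seconds = minimum_interval_seconds
        self._clock = clock
        self._last_known_good: Optional[dict[str, Any]] = None
        self._last_poll_time: Optional[float] = None

    def get_live_info(self, force: bool = False) -> Optional[dict[str, Any]]:
        now = self._clock()
        is_throttled = (
            self._last_poll_time is not None
            and now - self._last_poll_time < self._minimum_interval_seconds
        )
        if is_throttled and not force:
            return self._last_known_good

        # Never issue commands against a device node that is not present.
        if not os.path.exists(self._device_path):
            return None

        self._last_poll_time = now
        try:
            status = self._read_status()
        except OSError:
            status = None  # treat a failed poll like an empty one

        # An empty or failed poll keeps the previous state to avoid flicker.
        if status:
            self._last_known_good = status
        return self._last_known_good
```

Injecting `clock` and `read_status` keeps the throttling logic testable without real hardware or real delays.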
Frontend Reactivity
- Svelte 5 State: When mutating complex data structures like `Map` or `Set` in Svelte 5 `$state`, always explicitly reassign the variable (e.g., `myMap = new Map(myMap)`) after mutation to trigger the reactivity engine.
4. Pending Feature Implementations
- Media Pools & Sets: Transition from targeting individual media to targeting logical `MediaPool` entities. Archiver logic should resolve a pool to its active appendable member. Requires a new DB model and UI management.
- Location & Custody Tracking: Implement a formalized check-in/out ledger (`MediaCustodyLog`) for physical offline media.
- Barcode & Label Generation: Add a feature using `reportlab` or `weasyprint` to generate printable Avery-format PDF sheets containing Code 39 barcodes for tapes and QR codes for HDDs.
- Lifecycle Policies: Implement background tasks in `scheduler.py` to flag expired data for pruning based on user-defined retention rules. Add physical wear alerts to the dashboard based on tape `load_count` and `lifetime_mib_written`.