Compare commits: `1dc501f50f...master` (19 commits)

1f2f29d28f, 8de46f538d, f5ed1adec4, fb1ead7d63, f5ddfed38b, 65860e0408, 32fc9e4506, f40a76aa14, d398664e51, fa171176fc, 9e51247564, 4d4d9fa1e0, c3457308ba, 1ef2c194db, d77a79876f, ae74a0bf02, f44895d40f, c76ccd0dfa, 06eb00ab3e
@@ -1,57 +1,126 @@
# TapeHoard

> Physical media archival for people who don't trust the cloud alone.

**TapeHoard is not just for tapes.** It's a self-hosted backup manager for any offline-capable storage you already own:

- **Offline HDDs / USB drives** — any mountable filesystem (ext4, NTFS, APFS, exFAT)
- **S3-compatible cloud** — encrypted copies on Wasabi, Backblaze B2, MinIO, or any S3-compatible provider
- **LTO tape** — if you happen to own a tape drive, like some of us do

TapeHoard runs as a Docker container with native hardware access. It indexes your source filesystems, tracks what has been archived to which medium, and gives you a searchable catalog—even when the media itself is sitting in a vault across town. For full architectural details, see [PLAN.md](PLAN.md).
### Permissions (PUID/PGID)

The container supports `PUID` and `PGID` environment variables to ensure files written to volumes match your host user's identity.

**Critical:** To ensure fast startup, TapeHoard **does not** perform a recursive `chown` on your data. You must ensure your host directories are owned by the same PUID/PGID you provide to the container:

```bash
# Example: if PUID=1000 and PGID=1000
sudo chown -R 1000:1000 ./db ./staging ./source_data ./restores
```

## Features
| Feature | Description |
|---|---|
| **Index-First Design** | Browse, search, and check discrepancies against the database. The live filesystem is only touched during scans. |
| **Multi-Media Fleet** | Manage HDDs, USB drives, S3-compatible cloud, and LTO tape in one inventory. |
| **Ordered Auto-Archival** | Drag media to set fill priority. Backups flow to the first available medium in your sequence. |
| **LTO Tape Native** | Barcode discovery via MAM, hardware compression control, direct SCSI streaming to tape. |
| **Restore Queue** | Stage files for recovery. Get a minimum-media manifest so you only mount what you need. |
| **Discrepancy Detection** | Find missing files, changes without backup, or policy exclusions. |
| **Encrypted at Rest** | Per-media encryption secrets in a built-in keystore. Compatible with LTO hardware encryption (`stenc`) and client-side cloud encryption. |
| **Scheduled Scans** | Cron-like scheduling for automatic filesystem discovery and hashing. |
| **Exclusion Policies** | Global gitignore-style patterns to skip caches, build artifacts, and temp files. |
## Screenshots

| Dashboard | Media Inventory |
|---|---|
|  |  |

| Live Filesystem | System Settings |
|---|---|
|  |  |

## Quick Start (Docker Compose)

The recommended deployment is a single container with persistent volumes for the database, staging area, and source/restore mounts.
```yaml
services:
  tapehoard:
    image: ghcr.io/tapehoard/tapehoard:latest
    container_name: tapehoard
    environment:
      - PUID=1000
      - PGID=1000
      - TZ=UTC
      - DATABASE_URL=sqlite:////database/tapehoard.db
      - STAGING_DIRECTORY=/staging
    volumes:
      - ./database:/database
      - ./staging:/staging
      - /mnt/archive:/source_data:ro
      - /mnt/restores:/restores
    devices:
      # LTO Tape Drive Passthrough
      - /dev/nst0:/dev/nst0
      - /dev/sgX:/dev/sgX
    cap_add:
      - SYS_RAWIO
    ports:
      - "30265:8000"
    restart: unless-stopped
```
### Requirements

- **Linux host** (LTO tape support requires `mt`, `sg_read_attr`, and optionally `stenc` on the host or in the container)
- **Persistent volumes** — the database and staging directories must survive container restarts

> **No tape drive?** Remove the `cap_add`, `devices`, and `SYS_RAWIO` lines from the compose file above. TapeHoard works great with just HDDs, USB drives, or cloud storage.

## Project Structure

* `backend/`: Python/FastAPI application handling the heavy lifting (hashing, streaming, DB indexing).
* `frontend/`: Svelte 5 application providing the Web UI.
* `docker/`: Files required for building the multi-stage Docker container.
* `docs/`: Additional documentation.
### Hardware-Specific Notes

**HDDs / USB Drives (Recommended for most users):**
- Mount the drive's filesystem into the container at `/source_data` or a restore destination
- The HDD provider reads a `.tapehoard_id` file on the drive root to identify media
- No special capabilities required — works on any Linux, macOS, or Docker host

**S3-Compatible Cloud:**
- Configure the endpoint URL, bucket, region, and access credentials in settings
- Optional client-side filename obfuscation and encryption

**LTO Tape (For the dedicated):**
- The container must run as root or have access to the SCSI device node
- Requires the `SYS_RAWIO` capability for direct SCSI access
- Set `TAPEHOARD_TEST_MODE=true` to enable a mock LTO provider for development without hardware

### First Run

1. Start the container: `docker compose up -d`
2. Open `http://host:30265`
3. Go to **Settings → Drives** and configure your source roots (and tape drive path, if applicable)
4. Trigger an initial scan from the dashboard
5. Register media under **Physical Inventory**
6. Run your first backup
## Development

TapeHoard uses [`just`](https://github.com/casey/just) as its command runner. Install it (`brew install just` or `cargo install just`), then run `just` to see all available commands.

### Common Tasks

```bash
just dev              # Start backend + frontend with hot reload
just backend          # Start only the backend
just frontend         # Start only the frontend
just test             # Run linting, backend tests, and E2E tests
just lint             # Run Ruff, ty, and Svelte checks
just format           # Auto-format Python code
just generate-client  # Regenerate the TypeScript SDK from the OpenAPI spec
```
### Database Migrations

```bash
just db-upgrade                   # Apply pending migrations
just db-migrate "add user table"  # Autogenerate a new migration
```
## Why TapeHoard?

Most backup tools are built for always-online replication. TapeHoard is built for media you can unplug:

- **Air-gappable** — pull the drive or tape and store it offline. Your index stays searchable even when the media is in a vault.
- **Auditable** — every file's SHA-256 and every version's offset on every medium, tracked in SQLite.
- **No vendor lock-in** — standard tar archives on tape, standard files on disk, standard S3 objects in the cloud. If TapeHoard disappears, your data doesn't.
@@ -56,7 +56,7 @@ def get_exclusion_spec(db_session: Session) -> Optional[pathspec.PathSpec]:
        for pattern in settings_record.value.splitlines()
        if pattern.strip()
    ]
-    return pathspec.PathSpec.from_lines("gitwildmatch", exclusion_patterns)
+    return pathspec.PathSpec.from_lines("gitignore", exclusion_patterns)


def get_ignored_status(
@@ -159,6 +159,13 @@ class DashboardStatsSchema(BaseModel):
    redundancy_ratio: float


+class StagingInfoSchema(BaseModel):
+    path: str
+    total_bytes: int
+    used_bytes: int
+    free_bytes: int
+
+
class JobSchema(BaseModel):
    model_config = ConfigDict(from_attributes=True)
@@ -1,7 +1,10 @@
+import shutil
+
from fastapi import APIRouter, Depends
from sqlalchemy.orm import Session
from app.db.database import get_db
-from app.api.common import DashboardStatsSchema
+from app.api.common import DashboardStatsSchema, StagingInfoSchema
+from app.core.config import settings
from sqlalchemy import func, text
from app.db import models
@@ -113,3 +116,37 @@ def get_dashboard_stats(db_session: Session = Depends(get_db)):
        last_scan_time=last_scan.completed_at if last_scan else None,
        redundancy_ratio=round(redundancy_percentage, 1),
    )
+
+
+@router.get(
+    "/staging/info", response_model=StagingInfoSchema, operation_id="get_staging_info"
+)
+def get_staging_info():
+    """Returns disk usage information for the backup staging directory."""
+    path = settings.staging_directory
+    try:
+        usage = shutil.disk_usage(path)
+        return StagingInfoSchema(
+            path=path,
+            total_bytes=usage.total,
+            used_bytes=usage.used,
+            free_bytes=usage.free,
+        )
+    except OSError:
+        # Fallback: if the configured path doesn't exist yet, check its parent
+        parent = path if path == "/" else path.rsplit("/", 1)[0] or "/"
+        try:
+            usage = shutil.disk_usage(parent)
+            return StagingInfoSchema(
+                path=path,
+                total_bytes=usage.total,
+                used_bytes=usage.used,
+                free_bytes=usage.free,
+            )
+        except OSError:
+            return StagingInfoSchema(
+                path=path,
+                total_bytes=0,
+                used_bytes=0,
+                free_bytes=0,
+            )
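The endpoint's try-the-path, then-its-parent, then-zeros fallback can be exercised outside FastAPI. A stdlib-only sketch, where a plain dict stands in for `StagingInfoSchema`:

```python
import shutil


def staging_usage(path: str) -> dict:
    """Report disk usage for `path`, falling back to its parent, then zeros."""
    parent = path if path == "/" else path.rsplit("/", 1)[0] or "/"
    for candidate in (path, parent):
        try:
            usage = shutil.disk_usage(candidate)
            return {
                "path": path,
                "total_bytes": usage.total,
                "used_bytes": usage.used,
                "free_bytes": usage.free,
            }
        except OSError:
            continue
    return {"path": path, "total_bytes": 0, "used_bytes": 0, "free_bytes": 0}
```

Note that the reported `path` is always the configured one, even when the numbers come from the parent, so the UI keeps showing the directory the user asked about.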
@@ -127,7 +127,7 @@ def test_exclusions(
            total_files=0, total_size=0, matched_count=0, matched_size=0, sample=[]
        )

-    spec = pathspec.PathSpec.from_lines("gitwildmatch", patterns)
+    spec = pathspec.PathSpec.from_lines("gitignore", patterns)

    all_files = (
        db_session.query(models.FilesystemState)
@@ -179,7 +179,7 @@ def download_exclusion_report(
    if not patterns:
        raise HTTPException(status_code=400, detail="No patterns provided")

-    spec = pathspec.PathSpec.from_lines("gitwildmatch", patterns)
+    spec = pathspec.PathSpec.from_lines("gitignore", patterns)

    all_files = (
        db_session.query(models.FilesystemState)
@@ -4,6 +4,54 @@ import sys
from loguru import logger


+def _get_ionice_setting() -> str:
+    """Reads the user's preferred I/O scheduling class from settings."""
+    try:
+        from app.db.database import SessionLocal
+        from app.db import models
+
+        with SessionLocal() as db_session:
+            record = (
+                db_session.query(models.SystemSetting)
+                .filter(models.SystemSetting.key == "ionice_level")
+                .first()
+            )
+            if record and record.value in ("idle", "best-effort", "realtime"):
+                return record.value
+    except Exception:
+        pass
+    return "idle"  # Default: be the most polite
+
+
+def set_process_priority(level: str):
+    """Adjusts CPU and I/O priority of the current process.
+
+    Args:
+        level: "background" for lowest priority (ionice idle + nice 19),
+            "normal" to reset (ionice best-effort + nice 0).
+    """
+    try:
+        import psutil
+
+        p = psutil.Process(os.getpid())
+        if level == "background":
+            ionice_level = _get_ionice_setting()
+            if hasattr(p, "ionice"):
+                if ionice_level == "idle":
+                    p.ionice(psutil.IOPRIO_CLASS_IDLE)  # type: ignore[attr-defined]
+                elif ionice_level == "realtime":
+                    p.ionice(psutil.IOPRIO_CLASS_RT)  # type: ignore[attr-defined]
+                else:
+                    p.ionice(psutil.IOPRIO_CLASS_BE)  # type: ignore[attr-defined]
+            p.nice(19)
+        else:
+            if hasattr(p, "ionice"):
+                p.ionice(psutil.IOPRIO_CLASS_BE)  # type: ignore[attr-defined]
+            p.nice(0)
+    except Exception as e:
+        logger.debug(f"Could not set process priority to '{level}': {e}")
+
+
def get_path_uuid(path: str) -> str | None:
    """Attempts to retrieve a stable hardware/filesystem UUID for a given path."""
    if not os.path.exists(path):
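`set_process_priority` leans on psutil for both halves. For reference, the CPU half can be approximated with the stdlib alone — `os.nice` is Unix-only, and an unprivileged process can only raise its niceness, never lower it back:

```python
import os

# os.nice(increment) adjusts the calling process's niceness and returns
# the new value; an increment of 0 just reads the current level.
current = os.nice(0)
lowered = os.nice(5)  # be nicer: 5 points lower CPU priority (capped at 19)
assert lowered >= current
```

There is no stdlib equivalent for the I/O half (`ionice`), which is why the real function imports psutil and degrades to a debug log when anything fails.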
@@ -10,7 +10,7 @@ from loguru import logger


class LTOProvider(AbstractStorageProvider):
    provider_id = "lto_tape"
-    name = "LTO Tape Drive"
+    name = "LTO Tape"
    description = "Hardware Linear Tape-Open (LTO) drives."
    capabilities = {
        "supports_random_access": False,
@@ -70,7 +70,8 @@ class LTOProvider(AbstractStorageProvider):
            "drive": {},
            "mam": {},
            "online": False,
-            "last_check": 0.0,
+            "last_online_check": 0.0,
+            "last_mam_check": 0.0,
        }

    def _log_command(self, cmd: List[str]):
@@ -116,7 +117,8 @@ class LTOProvider(AbstractStorageProvider):
        # Throttle MAM reads to once every 2 seconds unless forced
        now = time.time()
        if not force and (
-            now - LTOProvider._lkg_state[self.device_path].get("last_check", 0) < 2.0
+            now - LTOProvider._lkg_state[self.device_path].get("last_mam_check", 0)
+            < 2.0
        ):
            return LTOProvider._lkg_state[self.device_path]["mam"]
@@ -233,22 +235,33 @@ class LTOProvider(AbstractStorageProvider):

                # SUCCESS! Update LKG MAM state
                LTOProvider._lkg_state[self.device_path]["mam"] = mam
-                LTOProvider._lkg_state[self.device_path]["last_check"] = time.time()
+                LTOProvider._lkg_state[self.device_path]["last_mam_check"] = (
+                    time.time()
+                )
                return mam

-            # If we get "Device or resource busy", wait a bit and retry
+            # Log failure so we can diagnose why sg_read_attr isn't working
            stderr_text = (
-                (result.stderr or b"").decode().lower()
+                (result.stderr or b"").decode()
                if isinstance(result.stderr, bytes)
-                else (result.stderr or "").lower()
+                else (result.stderr or "")
            )
-            if result.returncode != 0 and "busy" in stderr_text:
-                time.sleep(0.2)
-                continue
+            if result.returncode != 0:
+                logger.warning(
+                    f"sg_read_attr returned code {result.returncode} for {self.device_path} (attempt {attempt + 1}/3): {stderr_text[:200]}"
+                )
+                if "busy" in stderr_text.lower():
+                    time.sleep(0.2)
+                    continue

        except FileNotFoundError:
            logger.error(
                f"'sg_read_attr' binary not found in PATH. Cannot read MAM for {self.device_path}."
            )
            break
        except Exception as e:
-            logger.debug(
-                f"MAM read attempt {attempt} failed for {self.device_path}: {e}"
+            logger.warning(
+                f"MAM read attempt {attempt + 1}/3 failed for {self.device_path}: {e}"
            )
            time.sleep(0.1)
@@ -296,11 +309,15 @@ class LTOProvider(AbstractStorageProvider):
            now = time.time()
            if (
                not force
-                and now - LTOProvider._lkg_state[self.device_path].get("last_check", 0)
+                and now
+                - LTOProvider._lkg_state[self.device_path].get("last_online_check", 0)
                < 2.0
            ):
                return LTOProvider._lkg_state[self.device_path]["online"]

            is_online = False

            # 1. Try mt status
            try:
                cmd = ["mt", "-f", self.device_path, "status"]
                self._log_command(cmd)
@@ -314,22 +331,36 @@ class LTOProvider(AbstractStorageProvider):
                    "Device or resource busy" in stderr
                    or "Device or resource busy" in stdout
                ):
-                    LTOProvider._lkg_state[self.device_path]["online"] = True
-                    return True
+                    is_online = True
+                else:
+                    is_online = (
+                        "ONLINE" in stdout or "READY" in stdout or result.returncode == 0
+                    )
+            except FileNotFoundError:
+                logger.debug(f"'mt' binary not found for {self.device_path}")
            except Exception as e:
                logger.debug(f"mt status failed for {self.device_path}: {e}")

-            is_online = (
-                "ONLINE" in stdout or "READY" in stdout or result.returncode == 0
-            )
+            # 2. Fallback: try sg_turs (SCSI Test Unit Ready)
+            if not is_online:
+                try:
+                    cmd = ["sg_turs", self.device_path]
+                    self._log_command(cmd)
+                    result = subprocess.run(cmd, capture_output=True, timeout=5)
+                    if result.returncode == 0:
+                        is_online = True
+                except FileNotFoundError:
+                    logger.debug(f"'sg_turs' binary not found for {self.device_path}")
+                except Exception as e:
+                    logger.debug(f"sg_turs failed for {self.device_path}: {e}")

-            # If we transitioned from online -> offline, clear the LKG MAM (tape was likely ejected)
-            if LTOProvider._lkg_state[self.device_path]["online"] and not is_online:
-                LTOProvider._lkg_state[self.device_path]["mam"] = {}
+            # 3. If we transitioned from online -> offline, clear the LKG MAM (tape was likely ejected)
+            if LTOProvider._lkg_state[self.device_path]["online"] and not is_online:
+                LTOProvider._lkg_state[self.device_path]["mam"] = {}

-            LTOProvider._lkg_state[self.device_path]["online"] = is_online
-            LTOProvider._lkg_state[self.device_path]["last_check"] = now
-            return is_online
-        except Exception:
-            return LTOProvider._lkg_state[self.device_path]["online"]
+            LTOProvider._lkg_state[self.device_path]["online"] = is_online
+            LTOProvider._lkg_state[self.device_path]["last_online_check"] = now
+            return is_online

    def is_write_protected(self) -> bool:
        """Checks if the tape is write-protected (read-only)"""
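Splitting `last_check` into `last_online_check` and `last_mam_check` gives each probe its own last-known-good cache with an independent throttle. A generic stdlib sketch of that pattern (class and names are mine, not TapeHoard's):

```python
import time


class ThrottledProbe:
    """Cache a probe's result; re-run it at most once per `interval` seconds."""

    def __init__(self, probe, interval=2.0):
        self.probe = probe
        self.interval = interval
        self.value = None
        self.last_check = 0.0

    def get(self, force=False):
        now = time.time()
        if force or now - self.last_check >= self.interval:
            self.value = self.probe()  # expensive hardware check goes here
            self.last_check = now
        return self.value
```

With one shared timestamp, a cheap online check could starve a MAM read of its refresh window (or vice versa); two instances of this cache, one per probe, avoid that coupling.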
@@ -307,6 +307,10 @@ class ArchiverService:
        )
        JobManager.add_job_log(job_id, f"Starting backup to {media_record.identifier}")

+        from app.core.utils import set_process_priority
+
+        set_process_priority("background")
+
        workload_batch = self.assemble_backup_batch(db_session, media_id)
        if not workload_batch:
            JobManager.add_job_log(job_id, "No files require backup")
@@ -374,6 +378,31 @@ class ArchiverService:
        if current_chunk:
            chunks.append(current_chunk)

+        # --- Staging Space Validation ---
+        # Sequential media (tape) requires staging the full tarfile before writing.
+        # Ensure the staging directory has enough free space for the largest chunk.
+        if not storage_provider.capabilities.get("supports_random_access"):
+            largest_chunk_size = max(
+                sum(i["offset_end"] - i["offset_start"] for i in chunk)
+                for chunk in chunks
+            )
+            try:
+                usage = shutil.disk_usage(self.staging_directory)
+                # Require 110% of chunk size to leave headroom for tar overhead
+                required = int(largest_chunk_size * 1.1)
+                if usage.free < required:
+                    free_gb = usage.free / (1024**3)
+                    req_gb = required / (1024**3)
+                    JobManager.fail_job(
+                        job_id,
+                        f"Staging area at {self.staging_directory} has only {free_gb:.1f} GB free, "
+                        f"but the largest archive chunk requires {req_gb:.1f} GB. "
+                        f"Free up space or reduce the backup set.",
+                    )
+                    return
+            except OSError as e:
+                logger.warning(f"Could not check staging disk usage: {e}")
+
        JobManager.add_job_log(job_id, f"Packed into {len(chunks)} archive(s)")

        for chunk_index, chunk_items in enumerate(chunks):
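The 110% headroom rule is plain arithmetic; a worked example with a hypothetical 48 GiB chunk shows why a staging volume with 50 GiB free would still fail the check:

```python
largest_chunk_size = 48 * 1024**3          # 48 GiB tar chunk (hypothetical)
required = int(largest_chunk_size * 1.1)   # +10% headroom for tar overhead
free = 50 * 1024**3                        # 50 GiB free on the staging volume

# 48 GiB * 1.1 = 52.8 GiB required, so 50 GiB free is not enough:
print(free < required)  # True
```

Failing early here is cheaper than discovering mid-job that the staged tarfile cannot be written out.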
@@ -439,6 +468,19 @@ class ArchiverService:
                    remaining_to_write.append(item)

            if not remaining_to_write:
+                # Checkpoint: all files were duplicates; commit deduplication
+                # records so they aren't lost if the job fails later.
+                try:
+                    db_session.commit()
+                    JobManager.add_job_log(
+                        job_id,
+                        f"Checkpoint: chunk {chunk_num} deduplicated",
+                    )
+                except StaleDataError:
+                    db_session.rollback()
+                    logger.warning(
+                        f"Checkpoint commit failed for deduplicated chunk {chunk_num}"
+                    )
                continue

            if storage_provider.capabilities.get("supports_random_access"):
@@ -503,6 +545,20 @@ class ArchiverService:
                            offset_end=item["offset_end"],
                        )
                    )

+                # Checkpoint: commit after each successful chunk
+                try:
+                    db_session.commit()
+                    JobManager.add_job_log(
+                        job_id,
+                        f"Checkpoint: chunk {chunk_num} committed",
+                    )
+                except StaleDataError:
+                    db_session.rollback()
+                    logger.warning(
+                        f"Checkpoint commit failed for chunk {chunk_num}"
+                    )
+
            else:
                # Sequential Media (Tape): Hybrid Tar Generation
                has_splits = any(item["is_split"] for item in remaining_to_write)
@@ -642,6 +698,19 @@ class ArchiverService:
                if os.path.exists(staging_full_path):
                    os.remove(staging_full_path)

+                # Checkpoint: commit after each successful archive
+                try:
+                    db_session.commit()
+                    JobManager.add_job_log(
+                        job_id,
+                        f"Checkpoint: archive {chunk_num} committed",
+                    )
+                except StaleDataError:
+                    db_session.rollback()
+                    logger.warning(
+                        f"Checkpoint commit failed for archive {chunk_num}"
+                    )
+
            # --- Saturated Media Logic ---
            # If utilized over 98%, mark as full and cede priority
            # First, try to get actual hardware utilization (trust hardware MAM over our byte counts)
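Committing after each chunk bounds how much bookkeeping a late failure can undo: everything up to the last checkpoint survives. The same pattern in plain sqlite3 (table and chunk contents are illustrative, not TapeHoard's schema):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE archived (chunk INTEGER, path TEXT)")

chunks = [["a", "b"], ["c"], ["d", "e"]]
try:
    for num, chunk in enumerate(chunks, start=1):
        for path in chunk:
            if path == "d":
                raise RuntimeError("tape write failed")  # simulate mid-job failure
            conn.execute("INSERT INTO archived VALUES (?, ?)", (num, path))
        conn.commit()  # checkpoint: this chunk survives later failures
except RuntimeError:
    conn.rollback()  # only the in-flight chunk is lost

rows = conn.execute("SELECT path FROM archived ORDER BY path").fetchall()
print([r[0] for r in rows])  # ['a', 'b', 'c']
```

Chunks 1 and 2 were committed before the failure in chunk 3, so only the in-flight chunk rolls back.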
@@ -719,6 +788,7 @@ class ArchiverService:
            logger.exception(f"Archival failed: {e}")
            JobManager.fail_job(job_id, str(e))
        finally:
+            set_process_priority("normal")
            # Clean up any residual staging files
            for chunk_file in os.listdir(self.staging_directory):
                if chunk_file.startswith("backup_") and chunk_file.endswith(".tar"):
+78 −466
@@ -1,12 +1,10 @@
import concurrent.futures
import hashlib
import os
-import shutil
-import subprocess
import threading
import time
from datetime import datetime, timezone
-from typing import Any, Dict, List, Optional, Tuple
+from typing import Any, Dict, List, Optional

import psutil
from loguru import logger
@@ -16,278 +14,6 @@ from sqlalchemy.orm.exc import ObjectDeletedError, StaleDataError
from app.db import models
from app.db.database import SessionLocal

-# Fast file discovery via `find -printf` (GNU find or compatible).
-# Detected once at import time; falls back to os.walk if unavailable.
-_FAST_FIND_BINARY: Optional[str] = None
-
-# Fast hashing via `sha256sum` or `shasum`.
-# Detected once at import time; falls back to Python hashlib if unavailable.
-_FAST_HASH_BINARY: Optional[str] = None
-
-
-def _detect_fast_find() -> Optional[str]:
-    """Check if a `find` binary with `-printf` support is available.
-
-    Tries `gfind` (GNU find via Homebrew on macOS) first, then `find`.
-    Returns the binary path if `-printf` works, otherwise ``None``.
-    """
-    for candidate in ("gfind", "find"):
-        binary = shutil.which(candidate)
-        if binary is None:
-            continue
-        try:
-            result = subprocess.run(
-                [binary, "/tmp", "-maxdepth", "0", "-printf", "%f\n"],
-                capture_output=True,
-                timeout=5,
-            )
-            if result.returncode == 0 and result.stdout.strip() == b"tmp":
-                return binary
-        except Exception:
-            continue
-    return None
-
-
-def _detect_fast_hash() -> Optional[str]:
-    """Check if a SHA-256 binary is available for batch hashing.
-
-    Tries `sha256sum` (GNU coreutils, Linux/Homebrew) then `shasum` (macOS).
-    Returns the binary path if it works, otherwise ``None``.
-    """
-    # Try sha256sum first (Linux, Homebrew gnu-coreutils)
-    binary = shutil.which("sha256sum")
-    if binary:
-        try:
-            result = subprocess.run(
-                [binary, "/dev/null"],
-                capture_output=True,
-                timeout=5,
-            )
-            if (
-                result.returncode == 0
-                and b"e3b0c44298fc1c149afbf4c8996fb924" in result.stdout
-            ):
-                return binary
-        except Exception:
-            pass
-
-    # Try shasum (macOS default)
-    binary = shutil.which("shasum")
-    if binary:
-        try:
-            result = subprocess.run(
-                [binary, "-a", "256", "/dev/null"],
-                capture_output=True,
-                timeout=5,
-            )
-            if (
-                result.returncode == 0
-                and b"e3b0c44298fc1c149afbf4c8996fb924" in result.stdout
-            ):
-                return binary
-        except Exception:
-            pass
-
-    return None
-
-
-def _init_fast_features() -> Tuple[Optional[str], Optional[str]]:
-    global _FAST_FIND_BINARY, _FAST_HASH_BINARY
-    _FAST_FIND_BINARY = _detect_fast_find()
-    _FAST_HASH_BINARY = _detect_fast_hash()
-
-    if _FAST_FIND_BINARY:
-        logger.info(f"Fast file discovery enabled: using {_FAST_FIND_BINARY} -printf")
-    else:
-        logger.info("Fast file discovery unavailable: falling back to os.walk")
-
-    if _FAST_HASH_BINARY:
-        logger.info(f"Fast hashing enabled: using {_FAST_HASH_BINARY}")
-    else:
-        logger.info("Fast hashing unavailable: falling back to Python hashlib")
-
-    return _FAST_FIND_BINARY, _FAST_HASH_BINARY
-
-
-_FAST_FIND_BINARY, _FAST_HASH_BINARY = _init_fast_features()
-
-
-def _hash_file_batch_fast(
-    file_paths: List[str], binary: str
-) -> Dict[str, Optional[str]]:
-    """Hash a batch of files using a native SHA-256 binary.
-
-    Streams output line-by-line via subprocess.Popen for incremental progress.
-
-    Args:
-        file_paths: Paths to hash.
-        binary: Path to sha256sum or shasum.
-
-    Returns a mapping of file_path -> hex_digest (or None on failure).
-    """
-    results: Dict[str, Optional[str]] = {}
-
-    if not file_paths:
-        return results
-
-    # Build command: shasum needs -a 256 prefix, sha256sum doesn't
-    if binary.endswith("sha256sum"):
-        cmd = [binary, "--"] + file_paths
-    else:
-        # shasum
-        cmd = [binary, "-a", "256", "--"] + file_paths
-
-    try:
-        proc = subprocess.Popen(
-            cmd,
-            stdout=subprocess.PIPE,
-            stderr=subprocess.DEVNULL,
-        )
-
-        # Stream output line-by-line for incremental progress
-        if proc.stdout is None:
-            return results
-        for line in iter(proc.stdout.readline, b""):
-            line = line.strip()
-            if not line:
-                continue
-            # Format: "<hash>  <path>" or "<hash> *<path>"
-            parts = line.split(b"  ", 1)
-            if len(parts) != 2:
-                # Try single space with binary marker: "<hash> *<path>"
-                parts = line.split(b" *", 1)
-                if len(parts) != 2:
-                    continue
-
-            file_hash = parts[0].decode("ascii", errors="replace").lower()
-            raw_path = parts[1].decode("utf-8", errors="replace")
-
-            # sha256sum may escape backslashes in filenames; handle common case
-            clean_path = raw_path.replace("\\\\", "\\")
-
-            results[clean_path] = file_hash
-
-        proc.stdout.close()
-        proc.wait()
-
-    except Exception as e:
-        logger.error(f"Native hash batch failed: {e}")
-
-    return results
-
-
-def _discover_files_fast(
-    root_base: str,
-    job_id: Optional[int],
-    batch_size: int,
-    current_timestamp,
-    resolve_tracking,
-    sync_metadata_batch,
-    metrics_lock,
-    metrics,
-    db_session: Session,
-) -> Tuple[int, int]:
-    """Walk a tree using `find -printf` for fast metadata extraction.
-
-    Streams output line-by-line via subprocess.Popen so progress updates
-    appear as files are discovered instead of waiting for find to finish.
-
-    Returns (files_found, files_batched) counts.
-    """
-    total_files_found = 0
-    files_batched = 0
-    pending_metadata: List[Dict[str, Any]] = []
-
-    # -printf format: path\tsize\tmtime (tab-separated; split from right for safety)
-    find_binary = _FAST_FIND_BINARY
-    if find_binary is None:
-        logger.warning(
-            "Fast file discovery requested but no compatible `find` binary found"
-        )
-        return 0, 0
-    cmd = [
-        find_binary,
-        root_base,
-        "-type",
-        "f",
-        "-printf",
-        "%p\t%s\t%T@\n",
-    ]
-
-    try:
-        proc = subprocess.Popen(
-            cmd,
-            stdout=subprocess.PIPE,
-            stderr=subprocess.DEVNULL,
-        )
-        if proc.stdout is None:
-            logger.error(
-                f"Fast file discovery failed: could not open stdout for {root_base}"
-            )
-            return 0, 0
-    except Exception as e:
-        logger.error(f"Fast file discovery failed for {root_base}: {e}")
-        return 0, 0
-
-    # Stream output line by line (tab-separated: path\tsize\tmtime)
-    for line in iter(proc.stdout.readline, b""):
-        if job_id is not None and JobManager.is_cancelled(job_id):
-            break
-
-        if not line.strip():
-            continue
-
-        # Split from right: mtime and size are always numeric
-        parts = line.split(b"\t")
-        if len(parts) < 3:
-            continue
-
-        # First n-2 parts may be path components (tabs in filename are rare)
-        full_file_path = b"\t".join(parts[:-2]).decode("utf-8", errors="replace")
-        try:
-            file_size = int(parts[-2])
-            file_mtime = float(parts[-1])
-        except (ValueError, IndexError):
-            continue
-
-        total_files_found += 1
-        with metrics_lock:
-            metrics["total_files_found"] = total_files_found
-            metrics["current_path"] = os.path.dirname(full_file_path)
-
-        is_ignored = resolve_tracking(full_file_path)
-        pending_metadata.append(
-            {
-                "path": full_file_path,
-                "size": file_size,
-                "mtime": file_mtime,
-                "ignored": is_ignored,
-            }
-        )
-
-        if len(pending_metadata) >= batch_size:
-            sync_metadata_batch(db_session, pending_metadata, current_timestamp)
-            db_session.commit()
-            files_batched += len(pending_metadata)
-            pending_metadata = []
-            if job_id is not None:
-                JobManager.update_job(
-                    job_id,
-                    10.0,
-                    f"Discovered {total_files_found} items...",
-                )
-
-    proc.stdout.close()
-    proc.wait()
-
-    # Flush remaining batch
-    if pending_metadata:
-        sync_metadata_batch(db_session, pending_metadata, current_timestamp)
-        db_session.commit()
-        files_batched += len(pending_metadata)
-
-    return total_files_found, files_batched
-
-
class JobManager:
    """Manages operational job states and persistence with high resilience for background threads."""
@@ -443,23 +169,6 @@ class ScannerService:
                return
            time.sleep(0.1)

-    def _set_process_priority(self, level: str):
-        """Adjusts the CPU and I/O priority of the current process."""
-        try:
-            p = psutil.Process(os.getpid())
-            if level == "background":
-                if hasattr(p, "ionice"):
-                    p.ionice(
-                        psutil.IOPRIO_CLASS_IDLE  # ty: ignore[unresolved-attribute]
-                    )
-                p.nice(19)
-            else:
-                if hasattr(p, "ionice"):
-                    p.ionice(psutil.IOPRIO_CLASS_BE)  # ty: ignore[unresolved-attribute]
-                p.nice(0)
-        except Exception:
-            pass
-
    def compute_sha256(
        self, file_path: str, job_id: Optional[int] = None
    ) -> Optional[str]:
@@ -505,7 +214,9 @@ class ScannerService:
        JobManager.update_job(job_id, 0.0, "Starting system scan...")
        JobManager.add_job_log(job_id, "Starting system scan")

        self._set_process_priority("normal")
        from app.core.utils import set_process_priority

        set_process_priority("normal")
        with self._metrics_lock:
            self.files_processed = 0
            self.files_new = 0
@@ -556,63 +267,42 @@ class ScannerService:
            if not os.path.exists(root_base):
                continue

            if _FAST_FIND_BINARY:
                # Fast path: GNU find -printf (metadata extracted in C)
                metrics = {
                    "total_files_found": 0,
                    "current_path": root_base,
                }
                found, _ = _discover_files_fast(
                    root_base,
                    job_id,
                    BATCH_SIZE,
                    current_timestamp,
                    resolve_tracking,
                    self._sync_metadata_batch,
                    self._metrics_lock,
                    metrics,
                    db_session,
                )
                with self._metrics_lock:
                    self.total_files_found += found
            else:
                # Compatibility path: Python os.walk + os.stat
                for current_dir, _sub_dirs, file_names in os.walk(root_base):
                    if job_id is not None and JobManager.is_cancelled(job_id):
                        break

                    for name in file_names:
                        full_file_path = os.path.join(current_dir, name)
                        with self._metrics_lock:
                            self.total_files_found += 1
                            self.current_path = current_dir

                        try:
                            file_stats = os.stat(full_file_path)
                            is_ignored = resolve_tracking(full_file_path)
                            pending_metadata.append(
                                {
                                    "path": full_file_path,
                                    "size": file_stats.st_size,
                                    "mtime": file_stats.st_mtime,
                                    "ignored": is_ignored,
                                }
                            )
                        except (OSError, FileNotFoundError):
                            continue

                        if len(pending_metadata) >= BATCH_SIZE:
                            self._sync_metadata_batch(
                                db_session, pending_metadata, current_timestamp
                            )
                            db_session.commit()
                            pending_metadata = []
                            if job_id is not None:
                                JobManager.update_job(
                                    job_id,
                                    10.0,
                                    f"Discovered {self.total_files_found} items...",
                                )

        if pending_metadata:
            self._sync_metadata_batch(
@@ -729,7 +419,9 @@ class ScannerService:
        with self._metrics_lock:
            self.is_hashing = True

        self._set_process_priority("background")
        from app.core.utils import set_process_priority

        set_process_priority("background")

        try:
            with SessionLocal() as db_session:
@@ -751,10 +443,8 @@ class ScannerService:
                    .count()
                )

                # Fast hash batch size: more files per batch reduces subprocess overhead
                HASH_BATCH_SIZE = 100 if _FAST_HASH_BINARY else 100
                # How many files to pull from DB per iteration
                FETCH_LIMIT = HASH_BATCH_SIZE * 4
                FETCH_LIMIT = 400

                while self.is_hashing:
                    # Find unindexed work (exclude deleted files - they cannot be hashed)
@@ -780,126 +470,48 @@ class ScannerService:
                    if JobManager.is_cancelled(hashing_job.id):
                        break

                    if _FAST_HASH_BINARY:
                        # Fast path: batch files to native sha256sum/shasum
                        # Group into sub-batches of HASH_BATCH_SIZE for parallel processing
                        file_paths = [t.file_path for t in hashing_targets]
                        path_to_record = {t.file_path: t for t in hashing_targets}
                    # Hash files using Python hashlib via thread pool
                    max_workers = os.cpu_count() or 4
                    with concurrent.futures.ThreadPoolExecutor(
                        max_workers=max_workers
                    ) as hashing_executor:
                        future_to_file = {
                            hashing_executor.submit(
                                self.compute_sha256,
                                target.file_path,
                                hashing_job.id,
                            ): target
                            for target in hashing_targets
                        }

                        sub_batches = [
                            file_paths[i : i + HASH_BATCH_SIZE]
                            for i in range(0, len(file_paths), HASH_BATCH_SIZE)
                        ]
                        for future in concurrent.futures.as_completed(future_to_file):
                            if not self.is_hashing:
                                break

                        max_workers = min(os.cpu_count() or 4, len(sub_batches))
                        with concurrent.futures.ThreadPoolExecutor(
                            max_workers=max_workers
                        ) as hashing_executor:
                            future_to_batch = {
                                hashing_executor.submit(
                                    _hash_file_batch_fast,
                                    batch,
                                    _FAST_HASH_BINARY,
                                ): batch
                                for batch in sub_batches
                            }
                            target_record = future_to_file[future]
                            try:
                                computed_hash = future.result()
                            except Exception:
                                continue

                            for future in concurrent.futures.as_completed(
                                future_to_batch
                            ):
                                if not self.is_hashing:
                                    break

                                batch = future_to_batch[future]
                                try:
                                    batch_results = future.result()
                                except Exception:
                                    continue

                                # Apply hashes and detect missing files ONLY for this batch
                                for file_path in batch:
                                    target_record = path_to_record.get(file_path)
                                    if not target_record:
                                        continue

                                    if file_path in batch_results:
                                        target_record.sha256_hash = batch_results[
                                            file_path
                                        ]
                                        with self._metrics_lock:
                                            self.bytes_hashed += target_record.size or 0
                                            self.files_hashed += 1
                                        # Report progress incrementally as files complete
                                        if self.files_hashed % 5 == 0:
                                            progress = min(
                                                99.9,
                                                (
                                                    self.files_hashed
                                                    / max(total_pending, 1)
                                                )
                                                * 100,
                                            )
                                            JobManager.update_job(
                                                hashing_job.id,
                                                progress,
                                                f"Hashed {self.files_hashed} files ({self._format_throughput()})...",
                                            )
                                    elif not os.path.exists(file_path):
                                        target_record.is_deleted = True
                                        with self._metrics_lock:
                                            self.files_missing += 1

                                # Throttle between sub-batches if I/O pressure is high
                            if computed_hash:
                                target_record.sha256_hash = computed_hash
                                self.files_hashed += 1
                            elif not os.path.exists(target_record.file_path):
                                target_record.is_deleted = True
                                with self._metrics_lock:
                                    should_throttle = self.is_throttled
                                if should_throttle:
                                    time.sleep(0.5)
                    else:
                        # Compatibility path: Python hashlib via thread pool
                        max_workers = os.cpu_count() or 4
                        with concurrent.futures.ThreadPoolExecutor(
                            max_workers=max_workers
                        ) as hashing_executor:
                            future_to_file = {
                                hashing_executor.submit(
                                    self.compute_sha256,
                                    target.file_path,
                                    self.files_missing += 1

                            if self.files_hashed % 5 == 0:
                                progress = min(
                                    99.9,
                                    (self.files_hashed / max(total_pending, 1)) * 100,
                                )
                                JobManager.update_job(
                                    hashing_job.id,
                                ): target
                                for target in hashing_targets
                            }

                            for future in concurrent.futures.as_completed(
                                future_to_file
                            ):
                                if not self.is_hashing:
                                    break

                                target_record = future_to_file[future]
                                try:
                                    computed_hash = future.result()
                                except Exception:
                                    continue

                                if computed_hash:
                                    target_record.sha256_hash = computed_hash
                                    self.files_hashed += 1
                                elif not os.path.exists(target_record.file_path):
                                    target_record.is_deleted = True
                                    with self._metrics_lock:
                                        self.files_missing += 1

                                if self.files_hashed % 5 == 0:
                                    progress = min(
                                        99.9,
                                        (self.files_hashed / max(total_pending, 1))
                                        * 100,
                                    )
                                    JobManager.update_job(
                                        hashing_job.id,
                                        progress,
                                        f"Hashed {self.files_hashed} files ({self._format_throughput()})...",
                                    )
                                    progress,
                                    f"Hashed {self.files_hashed} files ({self._format_throughput()})...",
                                )

                    # Commit batch
                    try:
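The hunk above replaces per-file `compute_sha256` submissions with sub-batches handed to a native hasher. A rough stand-in with the same contract (list of paths in, `{path: hexdigest}` out), using `hashlib` instead of a `sha256sum` binary; the function names here are illustrative, not the project's:

```python
import concurrent.futures
import hashlib


def _hash_one(path: str, chunk_size: int = 1 << 20) -> str:
    # Stream the file so large archives are never loaded into memory at once
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        while chunk := fh.read(chunk_size):
            digest.update(chunk)
    return digest.hexdigest()


def hash_file_batch(paths):
    """Hash a batch of files in parallel; skip paths that vanished mid-run."""
    results = {}
    with concurrent.futures.ThreadPoolExecutor() as pool:
        futures = {pool.submit(_hash_one, p): p for p in paths}
        for fut in concurrent.futures.as_completed(futures):
            try:
                results[futures[fut]] = fut.result()
            except OSError:
                continue  # caller treats absent keys as missing/unreadable files
    return results
```

Returning a partial dict rather than raising mirrors the batch logic above, where files absent from `batch_results` are re-checked with `os.path.exists` and flagged `is_deleted`.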

@@ -82,9 +82,11 @@ def db_session():
        conn.execute(text("PRAGMA foreign_keys = OFF"))
        # Fetch all tables from the metadata
        for table_name in reversed(Base.metadata.tables.keys()):
            # Avoid truncating internal alembic or FTS tables
            if "alembic" not in table_name and "fts" not in table_name:
            # Avoid truncating internal alembic tables
            if "alembic" not in table_name:
                conn.execute(text(f"DELETE FROM {table_name}"))
        # FTS5 virtual table is not in Base.metadata; clear it explicitly
        conn.execute(text("DELETE FROM filesystem_fts"))
        conn.execute(text("PRAGMA foreign_keys = ON"))

@@ -1,4 +1,5 @@
from app.db import models
from app.services.archiver import archiver_manager
from datetime import datetime, timezone
import json

@@ -147,14 +148,21 @@ def test_search_index(client, db_session):
    )
    db_session.commit()

    # Trigger FTS manually since we are using raw SQL triggers which might not have fired
    # if we didn't insert via SQL or if there are issues in :memory:
    # but conftest uses a real temp file.
    # Manually insert into FTS5 since triggers may not fire on ORM inserts in tests
    from sqlalchemy import text

    db_session.execute(
        text("INSERT INTO filesystem_fts(rowid, file_path) VALUES (:rowid, :path)"),
        {"rowid": file1.id, "path": file1.file_path},
    )
    db_session.commit()

    response = client.get("/archive/search?q=important")
    assert response.status_code == 200
    # If FTS5 is working, it should return results.
    data = response.json()
    assert len(data) == 1
    assert data[0]["path"] == "data/important.doc"
    assert data[0]["name"] == "important.doc"
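The manual `INSERT` in this test works because search is served from an FTS5 table keyed by the source row's `rowid`. A self-contained illustration of that pattern (schema names mirror the test; FTS5 support in the bundled SQLite build is assumed):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE filesystem_state (id INTEGER PRIMARY KEY, file_path TEXT)")
conn.execute("CREATE VIRTUAL TABLE filesystem_fts USING fts5(file_path)")

conn.execute("INSERT INTO filesystem_state VALUES (1, 'data/important.doc')")
# Mirror the row into the FTS index, reusing the source table's id as rowid
conn.execute(
    "INSERT INTO filesystem_fts(rowid, file_path) VALUES (?, ?)",
    (1, "data/important.doc"),
)

# The default unicode61 tokenizer splits on '/' and '.', so 'important' matches
hits = conn.execute(
    "SELECT rowid FROM filesystem_fts WHERE filesystem_fts MATCH 'important'"
).fetchall()
```

Keeping `rowid` equal to the source id is what lets the API join a match back to its `FilesystemState` row.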


def test_get_metadata(client, db_session):
@@ -694,3 +702,500 @@ def test_metadata_directory(client, db_session):
    assert data["type"] == "directory"
    assert data["child_count"] == 2
    assert data["size"] == 300


# ── List Providers ──


def test_list_providers_includes_mock_in_test_mode(client, monkeypatch):
    """Tests that MockLTOProvider is included when TAPEHOARD_TEST_MODE is set."""
    monkeypatch.setenv("TAPEHOARD_TEST_MODE", "true")
    response = client.get("/inventory/providers")
    assert response.status_code == 200
    data = response.json()
    provider_ids = [p["provider_id"] for p in data]
    assert "lto_tape" in provider_ids
    assert "local_hdd" in provider_ids
    assert "s3_compat" in provider_ids
    assert "mock_lto" in provider_ids


# ── List Media with Provider State ──


def test_list_media_with_refresh(client, db_session, mocker):
    """Tests listing media with refresh=True queries hardware status."""
    media = models.StorageMedia(
        media_type="local_hdd",
        identifier="DISK_ONLINE",
        capacity=1000,
        status="active",
        extra_config='{"mount_path": "/tmp"}',
    )
    db_session.add(media)
    db_session.commit()

    mock_provider = mocker.MagicMock()
    mock_provider.check_online.return_value = True
    mock_provider.identify_media.return_value = "DISK_ONLINE"
    mock_provider.get_live_info.return_value = {"online": True}
    mock_provider.mount_base = "/tmp"

    mocker.patch.object(
        archiver_manager,
        "_get_storage_provider",
        return_value=mock_provider,
    )

    response = client.get("/inventory/media?refresh=true")
    assert response.status_code == 200
    data = response.json()
    assert len(data) == 1
    assert data[0]["is_online"] is True
    assert data[0]["is_identified"] is True


# ── Create Media Validation ──


def test_create_media_duplicate_identifier(client, db_session):
    """Tests creating media with duplicate identifier returns 400."""
    db_session.add(
        models.StorageMedia(
            media_type="hdd", identifier="DUPE", capacity=1000, status="active"
        )
    )
    db_session.commit()

    response = client.post(
        "/inventory/media",
        json={"media_type": "local_hdd", "identifier": "DUPE", "capacity": 1000},
    )
    assert response.status_code == 400
    assert "already exists" in response.json()["detail"]


# ── Update Media Edge Cases ──


def test_update_media_not_found(client):
    """Tests updating non-existent media returns 404."""
    response = client.patch("/inventory/media/99999", json={"location": "Nowhere"})
    assert response.status_code == 404


def test_update_status_to_failed_purges_versions(client, db_session):
    """Setting status to FAILED should delete all file_versions."""
    media = models.StorageMedia(
        media_type="hdd", identifier="DISK_FAIL_001", capacity=1000, status="active"
    )
    db_session.add(media)
    db_session.flush()

    file1 = models.FilesystemState(file_path="data/file1.txt", size=100, mtime=1000)
    db_session.add(file1)
    db_session.flush()

    db_session.add(
        models.FileVersion(
            filesystem_state_id=file1.id,
            media_id=media.id,
            file_number="1",
            offset_start=0,
            offset_end=100,
        )
    )
    db_session.commit()

    response = client.patch(
        f"/inventory/media/{media.id}",
        json={"status": "FAILED"},
    )
    assert response.status_code == 200

    from sqlalchemy import text

    result = db_session.execute(
        text("SELECT COUNT(*) FROM file_versions WHERE media_id = :media_id"),
        {"media_id": media.id},
    ).scalar()
    assert result == 0


def test_update_media_all_lto_fields(client, db_session):
    """Tests updating all LTO-specific fields."""
    media = models.StorageMedia(
        media_type="lto_tape",
        identifier="LTO_PATCH",
        capacity=1000,
        status="active",
    )
    db_session.add(media)
    db_session.commit()

    response = client.patch(
        f"/inventory/media/{media.id}",
        json={
            "generation": "LTO-9",
            "worm": True,
            "write_protected": True,
            "compression": False,
            "encryption_key_id": "new-key",
            "cleaning_cartridge": True,
        },
    )
    assert response.status_code == 200
    data = response.json()
    assert data["generation"] == "LTO-9"
    assert data["worm"] is True
    assert data["write_protected"] is True
    assert data["compression"] is False
    assert data["encryption_key_id"] == "new-key"
    assert data["cleaning_cartridge"] is True


def test_update_media_all_hdd_fields(client, db_session):
    """Tests updating all HDD-specific fields."""
    media = models.StorageMedia(
        media_type="local_hdd",
        identifier="HDD_PATCH",
        capacity=1000,
        status="active",
    )
    db_session.add(media)
    db_session.commit()

    response = client.patch(
        f"/inventory/media/{media.id}",
        json={
            "drive_model": "WD-Red",
            "device_uuid": "uuid-123",
            "is_ssd": True,
            "mount_path": "/mnt/backup",
            "filesystem_type": "ext4",
            "connection_interface": "USB3",
            "encrypted": True,
        },
    )
    assert response.status_code == 200
    data = response.json()
    assert data["drive_model"] == "WD-Red"
    assert data["device_uuid"] == "uuid-123"
    assert data["is_ssd"] is True
    assert data["mount_path"] == "/mnt/backup"
    assert data["filesystem_type"] == "ext4"
    assert data["connection_interface"] == "USB3"
    assert data["encrypted"] is True


def test_update_media_all_cloud_fields(client, db_session):
    """Tests updating all cloud-specific fields."""
    media = models.StorageMedia(
        media_type="s3_compat",
        identifier="CLOUD_PATCH",
        capacity=1000,
        status="active",
    )
    db_session.add(media)
    db_session.commit()

    response = client.patch(
        f"/inventory/media/{media.id}",
        json={
            "provider_template": "wasabi",
            "endpoint_url": "https://s3.wasabisys.com",
            "region": "us-east-1",
            "bucket_name": "my-bucket",
            "access_key_id": "AKIA...",
            "secret_access_key_name": "wasabi-key",
            "path_style_access": False,
            "storage_class": "STANDARD",
            "max_part_size_mb": 1000,
            "obfuscate_filenames": True,
            "encryption_secret_name": "enc-secret",
        },
    )
    assert response.status_code == 200
    data = response.json()
    assert data["provider_template"] == "wasabi"
    assert data["endpoint_url"] == "https://s3.wasabisys.com"
    assert data["region"] == "us-east-1"
    assert data["bucket_name"] == "my-bucket"
    assert data["access_key_id"] == "AKIA..."
    assert data["secret_access_key_name"] == "wasabi-key"
    assert data["path_style_access"] is False
    assert data["storage_class"] == "STANDARD"
    assert data["max_part_size_mb"] == 1000
    assert data["obfuscate_filenames"] is True
    assert data["encryption_secret_name"] == "enc-secret"


def test_update_media_legacy_extra_config_migration(client, db_session):
    """Tests that legacy extra_config keys are migrated to first-class columns."""
    media = models.StorageMedia(
        media_type="local_hdd",
        identifier="LEGACY_001",
        capacity=1000,
        status="active",
        extra_config=json.dumps(
            {
                "device_path": "/mnt/legacy",
                "encryption_key": "legacy-key",
                "encryption_passphrase": "legacy-pass",
            }
        ),
    )
    db_session.add(media)
    db_session.commit()

    response = client.patch(
        f"/inventory/media/{media.id}",
        json={"location": "Migrated"},
    )
    assert response.status_code == 200
    data = response.json()
    assert data["mount_path"] == "/mnt/legacy"
    assert data["encryption_key_id"] == "legacy-key"


# ── Delete Media ──


def test_delete_media_not_found(client):
    """Tests deleting non-existent media returns 404."""
    response = client.delete("/inventory/media/99999")
    assert response.status_code == 404


# ── Initialize Media ──


def test_initialize_media_not_found(client):
    """Tests initializing non-existent media returns 404."""
    response = client.post("/inventory/media/99999/initialize")
    assert response.status_code == 404


def test_initialize_media_no_provider(client, db_session, mocker):
    """Tests initializing media with unsupported type returns 400."""
    media = models.StorageMedia(
        media_type="hdd", identifier="NO_PROV", capacity=1000, status="active"
    )
    db_session.add(media)
    db_session.commit()

    mocker.patch.object(
        archiver_manager,
        "_get_storage_provider",
        return_value=None,
    )

    response = client.post(f"/inventory/media/{media.id}/initialize")
    assert response.status_code == 400
    assert "provider not found" in response.json()["detail"]


def test_initialize_media_existing_data_blocks(client, db_session, mocker):
    """Tests initialize blocks when existing data found and force=False."""
    media = models.StorageMedia(
        media_type="hdd", identifier="HAS_DATA", capacity=1000, status="active"
    )
    db_session.add(media)
    db_session.commit()

    mock_provider = mocker.MagicMock()
    mock_provider.check_existing_data.return_value = True
    mocker.patch.object(
        archiver_manager,
        "_get_storage_provider",
        return_value=mock_provider,
    )

    response = client.post(f"/inventory/media/{media.id}/initialize")
    assert response.status_code == 409
    assert "existing data" in response.json()["detail"]


def test_initialize_media_force_overwrite(client, db_session, mocker):
    """Tests initialize with force=True overwrites existing data."""
    media = models.StorageMedia(
        media_type="hdd",
        identifier="FORCE_INIT",
        capacity=1000,
        status="active",
        extra_config='{"mount_path": "/tmp"}',
    )
    db_session.add(media)
    db_session.commit()

    mock_provider = mocker.MagicMock()
    mock_provider.check_existing_data.return_value = True
    mock_provider.initialize_media.return_value = True
    mock_provider.device_path = "/tmp/init"
    mocker.patch.object(
        archiver_manager,
        "_get_storage_provider",
        return_value=mock_provider,
    )

    response = client.post(f"/inventory/media/{media.id}/initialize?force=true")
    assert response.status_code == 200
    assert "complete" in response.json()["message"]


def test_initialize_media_permission_error(client, db_session, mocker):
    """Tests initialize handles PermissionError."""
    media = models.StorageMedia(
        media_type="hdd", identifier="PERM_DENY", capacity=1000, status="active"
    )
    db_session.add(media)
    db_session.commit()

    mock_provider = mocker.MagicMock()
    mock_provider.check_existing_data.return_value = False
    mock_provider.initialize_media.side_effect = PermissionError("Access denied")
    mocker.patch.object(
        archiver_manager,
        "_get_storage_provider",
        return_value=mock_provider,
    )

    response = client.post(f"/inventory/media/{media.id}/initialize")
    assert response.status_code == 403
    assert "Access denied" in response.json()["detail"]


# ── Reorder Media ──


def test_reorder_media(client, db_session):
    """Tests reordering media priority."""
    m1 = models.StorageMedia(
        media_type="hdd", identifier="A", capacity=1000, status="active"
    )
    m2 = models.StorageMedia(
        media_type="hdd", identifier="B", capacity=1000, status="active"
    )
    db_session.add_all([m1, m2])
    db_session.commit()

    response = client.post(
        "/inventory/media/reorder", json={"media_ids": [m2.id, m1.id]}
    )
    assert response.status_code == 200

    db_session.expire_all()
    assert db_session.get(models.StorageMedia, m2.id).priority_index == 0
    assert db_session.get(models.StorageMedia, m1.id).priority_index == 1


# ── Insights Deep Tests ──


def test_insights_with_duplicates_and_aging(client, db_session):
    """Tests insights reports duplicates, aging, redundancy, and extensions."""
    now = datetime.now(timezone.utc).timestamp()

    # Two files with same hash (duplicate)
    f1 = models.FilesystemState(
        file_path="/data/a.txt", size=100, mtime=now, sha256_hash="duphash"
    )
    f2 = models.FilesystemState(
        file_path="/data/b.txt", size=100, mtime=now, sha256_hash="duphash"
    )
    # Protected file
    f3 = models.FilesystemState(
        file_path="/data/c.txt",
        size=200,
        mtime=now - 400 * 24 * 3600,
        sha256_hash="hash3",
    )
    db_session.add_all([f1, f2, f3])
    db_session.flush()

    media = models.StorageMedia(
        media_type="hdd", identifier="M1", capacity=1000, status="active"
    )
    db_session.add(media)
    db_session.flush()

    db_session.add(
        models.FileVersion(
            filesystem_state_id=f3.id,
            media_id=media.id,
            file_number="1",
            offset_start=0,
            offset_end=200,
        )
    )
    db_session.commit()

    response = client.get("/inventory/insights")
    assert response.status_code == 200
    data = response.json()

    assert data["summary"]["total_files"] == 3
    assert data["summary"]["total_bytes"] == 400
    assert data["summary"]["protected_bytes"] == 200
    assert data["summary"]["vulnerable_bytes"] == 200

    assert len(data["duplicates"]) == 1
    assert data["duplicates"][0]["copies"] == 2
    assert data["duplicates"][0]["saved"] == 100

    assert len(data["extensions"]) >= 1
    assert any(e["ext"] == "txt" for e in data["extensions"])

    assert len(data["redundancy"]) >= 1


# ── Treemap / Directories ──


def test_get_treemap(client, db_session):
    """Tests the treemap endpoint returns hierarchical directory data."""
    f1 = models.FilesystemState(file_path="/data/sub/file1.txt", size=100, mtime=1000)
    f2 = models.FilesystemState(file_path="/data/sub/file2.txt", size=200, mtime=1000)
    db_session.add_all([f1, f2])
    db_session.commit()

    response = client.get("/inventory/directories")
    assert response.status_code == 200
    data = response.json()
    assert isinstance(data, list)
    # Should contain data directory
    assert len(data) >= 1


# ── Detect Media ──


def test_detect_media_finds_new_insertion(client, db_session, mocker):
    """Tests detect_media finds newly inserted unregistered media."""
    media = models.StorageMedia(
        media_type="lto_tape",
        identifier="EXISTING_TAPE",
        capacity=1000,
        status="active",
        extra_config=json.dumps({"device_path": "/dev/nst0"}),
    )
    db_session.add(media)
    db_session.commit()

    mock_provider = mocker.MagicMock()
    mock_provider.provider_id = "lto_tape"
    mock_provider.check_online.return_value = True
    mock_provider.get_live_info.return_value = {"identity": "NEW_TAPE_01"}

    mocker.patch.object(
        archiver_manager,
        "_get_storage_provider",
        return_value=mock_provider,
    )

    response = client.get("/inventory/detect")
    assert response.status_code == 200
    data = response.json()
    assert len(data) == 1
    assert data[0]["identifier"] == "NEW_TAPE_01"
    assert data[0]["device_path"] == "/dev/nst0"
@@ -1,4 +1,4 @@
from datetime import datetime, timezone
from datetime import datetime, timedelta, timezone

import pytest

@@ -17,19 +17,22 @@ def test_list_jobs_empty(client):

def test_list_jobs_populated(client, db_session):
    """Tests listing jobs with pagination and latest_log inclusion."""
    now = datetime.now(timezone.utc)
    job1 = models.Job(
        job_type="SCAN",
        status="COMPLETED",
        progress=100.0,
        current_task="Done",
        started_at=datetime.now(timezone.utc),
        completed_at=datetime.now(timezone.utc),
        started_at=now - timedelta(seconds=2),
        completed_at=now - timedelta(seconds=1),
        created_at=now - timedelta(seconds=2),
    )
    job2 = models.Job(
        job_type="BACKUP",
        status="RUNNING",
        progress=50.0,
        current_task="Writing archive",
        created_at=now,
    )
    db_session.add_all([job1, job2])
    db_session.flush()

@@ -0,0 +1,196 @@
from app.db import models

# ── Settings CRUD ──


def test_get_settings_empty(client):
    """Tests retrieving settings when none are set."""
    response = client.get("/system/settings")
    assert response.status_code == 200
    assert response.json() == {}


def test_update_settings(client):
    """Tests updating a system setting."""
    response = client.post(
        "/system/settings", json={"key": "schedule_scan", "value": "0 2 * * *"}
    )
    assert response.status_code == 200
    assert response.json() == {"message": "Setting committed."}

    # Verify retrieval
    response = client.get("/system/settings")
    assert response.json()["schedule_scan"] == "0 2 * * *"


def test_update_settings_triggers_scheduler_reload(client, mocker):
    """Tests that updating schedule_scan reloads the scheduler."""
    from app.services.scheduler import scheduler_manager

    reload_spy = mocker.spy(scheduler_manager, "reload")
    response = client.post(
        "/system/settings", json={"key": "schedule_scan", "value": "0 3 * * *"}
    )
    assert response.status_code == 200
    reload_spy.assert_called_once()


def test_update_global_exclusions_recomputes_policy(client, db_session, mocker):
    """Tests that updating global_exclusions triggers policy recompute."""
    recompute_spy = mocker.patch("app.api.system.settings.recompute_exclusion_policy")
    response = client.post(
        "/system/settings",
        json={"key": "global_exclusions", "value": "*.tmp\n*.log"},
    )
    assert response.status_code == 200
    recompute_spy.assert_called_once()


# ── Exclusion Testing ──


def test_test_exclusions_empty_patterns(client):
    """Tests exclusion test with empty patterns returns zeros."""
    response = client.post(
        "/system/settings/test-exclusions",
        json={"patterns": "", "limit": 10},
    )
    assert response.status_code == 200
    data = response.json()
    assert data["total_files"] == 0
    assert data["matched_count"] == 0
    assert data["sample"] == []


def test_test_exclusions_matches_files(client, db_session):
    """Tests exclusion patterns against indexed files."""
    db_session.add_all(
        [
            models.FilesystemState(
                file_path="/data/file.txt", size=100, mtime=1000, is_deleted=False
            ),
            models.FilesystemState(
                file_path="/data/temp.tmp", size=50, mtime=1000, is_deleted=False
            ),
            models.FilesystemState(
                file_path="/data/debug.log", size=200, mtime=1000, is_deleted=False
            ),
        ]
    )
    db_session.commit()

    response = client.post(
        "/system/settings/test-exclusions",
        json={"patterns": "*.tmp\n*.log", "limit": 10},
    )
    assert response.status_code == 200
    data = response.json()
    assert data["total_files"] == 3
    assert data["matched_count"] == 2
    assert data["matched_size"] == 250
    assert len(data["sample"]) == 2
|
||||
|
||||
|
||||
def test_test_exclusions_deleted_files_excluded(client, db_session):
    """Tests that deleted files are excluded from exclusion testing."""
    db_session.add_all(
        [
            models.FilesystemState(
                file_path="/data/keep.txt",
                size=100,
                mtime=1000,
                is_deleted=False,
            ),
            models.FilesystemState(
                file_path="/data/old.tmp",
                size=50,
                mtime=1000,
                is_deleted=True,
            ),
        ]
    )
    db_session.commit()

    response = client.post(
        "/system/settings/test-exclusions",
        json={"patterns": "*.tmp", "limit": 10},
    )
    assert response.status_code == 200
    data = response.json()
    assert data["total_files"] == 1
    assert data["matched_count"] == 0

# ── Exclusion Report Download ──


def test_download_exclusion_report(client, db_session):
    """Tests CSV report generation for exclusion matches."""
    db_session.add(
        models.FilesystemState(
            file_path="/data/target.log", size=100, mtime=1000, is_deleted=False
        )
    )
    db_session.commit()

    response = client.post(
        "/system/settings/test-exclusions/download",
        json={"patterns": "*.log", "limit": 10},
    )
    assert response.status_code == 200
    assert response.headers["content-type"] == "text/csv; charset=utf-8"
    assert "exclusion_report.csv" in response.headers["content-disposition"]
    content = response.content.decode("utf-8")
    assert "path,size,mtime,sha256_hash" in content
    assert "target.log" in content


def test_download_exclusion_report_no_patterns(client):
    """Tests download with empty patterns returns 400."""
    response = client.post(
        "/system/settings/test-exclusions/download",
        json={"patterns": "", "limit": 10},
    )
    assert response.status_code == 400
    assert "No patterns provided" in response.json()["detail"]

# ── Secrets Keystore (complementing test_api_system.py) ──


def test_create_secret(client):
    """Tests creating a secret."""
    response = client.post(
        "/system/secrets", json={"name": "api-key", "value": "secret123"}
    )
    assert response.status_code == 200
    assert "stored" in response.json()["message"]

    response = client.get("/system/secrets")
    assert "api-key" in response.json()


def test_get_secret_value(client):
    """Tests retrieving a secret value."""
    client.post("/system/secrets", json={"name": "key-1", "value": "val-1"})

    response = client.get("/system/secrets/key-1")
    assert response.status_code == 200
    assert response.json()["value"] == "val-1"


def test_delete_secret(client):
    """Tests deleting a secret."""
    client.post("/system/secrets", json={"name": "to-delete", "value": "x"})

    response = client.request("DELETE", "/system/secrets", json={"name": "to-delete"})
    assert response.status_code == 200

    response = client.get("/system/secrets")
    assert "to-delete" not in response.json()


def test_delete_secret_not_found(client):
    """Tests deleting a non-existent secret returns 404."""
    response = client.request("DELETE", "/system/secrets", json={"name": "missing"})
    assert response.status_code == 404

@@ -1,3 +1,4 @@
import json
from datetime import datetime, timezone

from app.db import models
@@ -52,13 +53,6 @@ def test_update_settings(client):
    assert response.json()["schedule_scan"] == "0 2 * * *"


def test_list_jobs_empty(client):
    """Tests listing jobs when none exist."""
    response = client.get("/system/jobs")
    assert response.status_code == 200
    assert response.json() == []


def test_trigger_scan(client):
    """Tests triggering a system scan."""
    response = client.post("/system/scan")
@@ -77,10 +71,17 @@ def test_get_scan_status(client):


def test_ls_root(client):
    """Tests listing the root directory returns actual subdirectories."""
    response = client.get("/system/ls?path=/")
    assert response.status_code == 200
    data = response.json()
    assert isinstance(data, list)
    assert len(data) > 0
    for entry in data:
        assert "name" in entry
        assert "path" in entry
        assert entry["name"] != ""
        assert entry["path"] != ""

def test_ignore_hardware(client):
@@ -104,127 +105,6 @@ def test_scan_status_includes_files_missing(client):
    assert data["files_missing"] == 0


def test_list_discrepancies_empty(client):
    """Tests listing discrepancies when none exist."""
    response = client.get("/system/discrepancies")
    assert response.status_code == 200
    assert response.json() == []


def test_list_discrepancies_deleted_file(client, db_session):
    """Tests listing a confirmed-deleted file in discrepancies."""
    file_record = models.FilesystemState(
        file_path="/data/old.txt",
        size=100,
        mtime=1000,
        is_deleted=True,
        is_ignored=False,
        sha256_hash=None,
    )
    db_session.add(file_record)
    db_session.commit()

    response = client.get("/system/discrepancies")
    assert response.status_code == 200
    data = response.json()
    assert len(data) == 1
    assert data[0]["path"] == "/data/old.txt"
    assert data[0]["is_deleted"] is True


def test_confirm_file_deleted(client, db_session):
    """Tests confirming a file as deleted."""
    file_record = models.FilesystemState(
        file_path="/data/verify.txt",
        size=50,
        mtime=2000,
        is_deleted=False,
    )
    db_session.add(file_record)
    db_session.commit()

    response = client.post(f"/system/discrepancies/{file_record.id}/confirm")
    assert response.status_code == 200
    assert "marked as deleted" in response.json()["message"]

    db_session.expire_all()
    db_session.refresh(file_record)
    assert file_record.is_deleted is True


def test_confirm_file_deleted_not_found(client):
    """Tests confirming a non-existent file returns 404."""
    response = client.post("/system/discrepancies/9999/confirm")
    assert response.status_code == 404


def test_dismiss_discrepancy(client, db_session):
    """Tests dismissing a deleted file."""
    file_record = models.FilesystemState(
        file_path="/data/dismiss.txt",
        size=50,
        mtime=2000,
        is_deleted=True,
    )
    db_session.add(file_record)
    db_session.commit()

    response = client.post(f"/system/discrepancies/{file_record.id}/dismiss")
    assert response.status_code == 200
    assert "dismissed" in response.json()["message"]

    db_session.expire_all()
    db_session.refresh(file_record)
    assert file_record.missing_acknowledged_at is not None

def test_delete_file_record(client, db_session):
    """Tests hard-deleting a file record and its versions."""
    media = models.StorageMedia(
        media_type="hdd", identifier="M1", capacity=1000, status="active"
    )
    db_session.add(media)
    db_session.flush()

    file_record = models.FilesystemState(
        file_path="/data/hard_delete.txt",
        size=100,
        mtime=1000,
        is_deleted=True,
    )
    db_session.add(file_record)
    db_session.flush()

    db_session.add(
        models.FileVersion(
            filesystem_state_id=file_record.id,
            media_id=media.id,
            file_number="1",
            offset_start=0,
            offset_end=100,
        )
    )
    db_session.commit()

    file_id = file_record.id

    response = client.delete(f"/system/discrepancies/{file_id}")
    assert response.status_code == 200

    db_session.expire_all()

    # Verify file and version are gone
    assert (
        db_session.query(models.FilesystemState).filter_by(id=file_id).first() is None
    )
    assert (
        db_session.query(models.FileVersion)
        .filter_by(filesystem_state_id=file_id)
        .first()
        is None
    )


def test_dashboard_stats_excludes_failed_media(client, db_session):
    """Tests that dashboard stats do not count versions on failed or retired media."""
    active_media = models.StorageMedia(
@@ -593,10 +473,13 @@ def test_ignore_hardware_duplicate(client):

def test_database_export(client):
    """Tests database export endpoint returns a SQLite file download."""
    response = client.get("/system/database/export")
    assert response.status_code == 200
    assert "tapehoard_index_" in response.headers["content-disposition"]
    assert ".db" in response.headers["content-disposition"]
    # Should contain SQLite magic bytes
    assert response.content[:16] == b"SQLite format 3\x00"


# ── Tracking Batch ──
@@ -676,5 +559,69 @@ def test_test_notification_invalid_url(client):
    response = client.post(
        "/system/notifications/test", json={"url": "not-a-valid-url"}
    )
    assert response.status_code == 500
    assert "Failed to dispatch test alert" in response.json()["detail"]

# ── Host Directory Listing ──


def test_ls_traversal_rejected(client):
    """Tests that path traversal attempts are blocked."""
    response = client.get("/system/ls?path=/etc/../secret")
    assert response.status_code == 403
    assert "Path traversal not allowed" in response.json()["detail"]


def test_ls_nonexistent_path(client):
    """Tests listing a non-existent directory returns empty list."""
    response = client.get("/system/ls?path=/nonexistent_path_12345")
    assert response.status_code == 200
    assert response.json() == []

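The traversal rejection these tests exercise can be sketched in a few lines. This is a minimal illustration under stated assumptions, not TapeHoard's actual guard, and `is_traversal_safe` is a hypothetical helper name:

```python
import os


def is_traversal_safe(path: str) -> bool:
    # A request path is suspicious if normalizing it changes it (it contained
    # ".." or redundant separators), or if any component is literally "..".
    return os.path.normpath(path) == path and ".." not in path.split(os.sep)
```

An endpoint would return 403 whenever this check fails, before touching the filesystem.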
# ── System Tree ──


def test_system_tree_root(client, db_session):
    """Tests system tree at ROOT returns configured source roots."""
    db_session.add(models.SystemSetting(key="source_roots", value='["/source_data"]'))
    db_session.commit()

    response = client.get("/system/tree")
    assert response.status_code == 200
    data = response.json()
    assert len(data) == 1
    assert data[0]["name"] == "/source_data"
    assert data[0]["has_children"] is True


def test_system_tree_subdirectory(client, db_session):
    """Tests system tree browsing a subdirectory."""
    import tempfile

    with tempfile.TemporaryDirectory() as tmpdir:
        db_session.add(
            models.SystemSetting(key="source_roots", value=json.dumps([tmpdir]))
        )
        db_session.commit()

        # Create a subdirectory
        import os

        os.makedirs(os.path.join(tmpdir, "subdir"))

        response = client.get(f"/system/tree?path={tmpdir}")
        assert response.status_code == 200
        data = response.json()
        assert len(data) == 1
        assert data[0]["name"] == "subdir"


def test_system_tree_outside_roots(client, db_session):
    """Tests tree browsing outside roots returns 403."""
    db_session.add(models.SystemSetting(key="source_roots", value='["/source_data"]'))
    db_session.commit()

    response = client.get("/system/tree?path=/etc")
    assert response.status_code == 403

@@ -1,84 +1,379 @@
import hashlib
import io
import os
import pytest
from unittest.mock import MagicMock

from app.providers.cloud import CloudStorageProvider


def test_cloud_provider_obfuscation_logic():
    """Verifies that filename hashing and sharding works as expected."""

    # CASE 1: Obfuscation Disabled
    config_plain = {
        "bucket_name": "test-bucket",
        "obfuscate_filenames": False,
        "access_key": "fake",
        "secret_key": "fake",
    }
    provider_plain = CloudStorageProvider(config_plain)
    path = "documents/secret_plan.pdf"

    # Expectation: Key is exactly the path with prefix
    key_plain = provider_plain._get_obfuscated_key("objects", path)
    assert key_plain == "objects/documents/secret_plan.pdf"

    # CASE 2: Obfuscation Enabled
    config_hidden = {
        "bucket_name": "test-bucket",
        "obfuscate_filenames": True,
        "access_key": "fake",
        "secret_key": "fake",
    }
    provider_hidden = CloudStorageProvider(config_hidden)

    # Expectation: Key is hashed and sharded
    # hash of "documents/secret_plan.pdf"
    expected_hash = hashlib.sha256(path.encode("utf-8")).hexdigest()
    expected_prefix = f"objects/{expected_hash[:2]}/{expected_hash[2:4]}"

    key_hidden = provider_hidden._get_obfuscated_key("objects", path)

    assert key_hidden.startswith("objects/")
    assert key_hidden == f"{expected_prefix}/{expected_hash}"
    assert "secret_plan.pdf" not in key_hidden

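The obfuscation test pins down the key layout exactly: SHA-256 the relative path, then shard by the first two hex-pairs of the digest. That scheme can be reproduced as a standalone sketch (the helper name here is hypothetical; in the provider the logic lives in `_get_obfuscated_key`):

```python
import hashlib


def obfuscated_key(prefix: str, path: str) -> str:
    # SHA-256 the path, then shard as prefix/ab/cd/abcd... so any single
    # S3 "directory" listing stays small even with millions of objects.
    digest = hashlib.sha256(path.encode("utf-8")).hexdigest()
    return f"{prefix}/{digest[:2]}/{digest[2:4]}/{digest}"


key = obfuscated_key("objects", "documents/secret_plan.pdf")
```

Because the key is a pure function of the path, lookups need no mapping table, yet the original filename never appears in the bucket.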
# ── Constructor & Config ──


def test_cloud_provider_endpoint_normalization(mocker):
    """Tests that endpoint URLs without protocol get https:// prepended."""
    # Mock boto3.client to avoid slow initialization in unit tests
    mock_boto = mocker.patch("app.providers.cloud.boto3")

    provider = CloudStorageProvider(
        {
            "bucket_name": "test-bucket",
            "endpoint_url": "s3.example.com",
            "region": "eu-west-1",
            "access_key": "ak",
            "secret_key": "sk",
        }
    )

    call_kwargs = mock_boto.client.call_args[1]
    assert call_kwargs["endpoint_url"] == "https://s3.example.com"
    assert call_kwargs["region_name"] == "eu-west-1"
    assert provider.provider_type == "S3"


def test_cloud_provider_endpoint_no_modification(mocker):
    """Tests that endpoint URLs with existing protocol are left alone."""
    mock_boto = mocker.patch("app.providers.cloud.boto3")

    CloudStorageProvider(
        {
            "bucket_name": "test-bucket",
            "endpoint_url": "http://minio.local:9000",
        }
    )

    call_kwargs = mock_boto.client.call_args[1]
    assert call_kwargs["endpoint_url"] == "http://minio.local:9000"

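The behaviour these two tests fix (bare hosts get `https://` prepended, explicit schemes pass through) amounts to a one-branch normalizer. A sketch of that rule, with a hypothetical function name:

```python
def normalize_endpoint(url):
    # Bare hosts like "s3.example.com" get https:// prepended; URLs that
    # already carry http:// or https:// (e.g. a local MinIO) are untouched.
    if url and not url.startswith(("http://", "https://")):
        return "https://" + url
    return url
```

Defaulting to HTTPS keeps a mistyped config from silently sending credentials in clear text, while still allowing an explicit `http://` for local endpoints.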
def test_cloud_provider_defaults(mocker):
    """Tests default values when minimal config is provided."""
    mock_boto = mocker.patch("app.providers.cloud.boto3")

    provider = CloudStorageProvider({"bucket_name": "b"})

    assert provider.region == "us-east-1"
    assert provider.endpoint_url is None
    assert provider.obfuscate is False
    mock_boto.client.assert_called_once()


# ── Online & Identification ──


def test_check_online_success(mocker):
    """Tests check_online returns True when head_bucket succeeds."""
    mocker.patch("app.providers.cloud.boto3")
    provider = CloudStorageProvider({"bucket_name": "b"})
    provider.s3.head_bucket = MagicMock(return_value=None)

    assert provider.check_online() is True


def test_check_online_failure(mocker):
    """Tests check_online returns False when head_bucket raises."""
    mocker.patch("app.providers.cloud.boto3")
    provider = CloudStorageProvider({"bucket_name": "b"})
    provider.s3.head_bucket = MagicMock(side_effect=Exception("timeout"))

    assert provider.check_online() is False

def test_get_live_info(mocker):
    """Tests get_live_info returns provider metadata."""
    mocker.patch("app.providers.cloud.boto3")
    provider = CloudStorageProvider({"bucket_name": "my-bucket"})
    provider.s3.head_bucket = MagicMock(return_value=None)

    info = provider.get_live_info()
    assert info["online"] is True
    assert info["provider"] == "S3"
    assert info["bucket"] == "my-bucket"


def test_check_existing_data_found(mocker):
    """Tests check_existing_data when objects exist under archives/."""
    mocker.patch("app.providers.cloud.boto3")
    provider = CloudStorageProvider({"bucket_name": "b"})
    provider.s3.list_objects_v2 = MagicMock(
        return_value={"Contents": [{"Key": "archives/1.tar"}]}
    )

    assert provider.check_existing_data() is True


def test_check_existing_data_empty(mocker):
    """Tests check_existing_data when no objects exist."""
    mocker.patch("app.providers.cloud.boto3")
    provider = CloudStorageProvider({"bucket_name": "b"})
    provider.s3.list_objects_v2 = MagicMock(return_value={})

    assert provider.check_existing_data() is False

def test_identify_media_by_id_file(mocker):
    """Tests identify_media reads .tapehoard_id when available."""
    mocker.patch("app.providers.cloud.boto3")
    provider = CloudStorageProvider({"bucket_name": "b"})
    mock_body = MagicMock()
    mock_body.read.return_value = b" BUCKET_001 "
    provider.s3.get_object = MagicMock(return_value={"Body": mock_body})

    result = provider.identify_media()
    assert result == "BUCKET_001"


def test_identify_media_fallback_to_bucket_name(mocker):
    """Tests identify_media falls back to bucket name when .tapehoard_id missing."""
    mocker.patch("app.providers.cloud.boto3")
    provider = CloudStorageProvider({"bucket_name": "fallback-bucket"})
    provider.s3.get_object = MagicMock(side_effect=Exception("NoSuchKey"))
    provider.s3.head_bucket = MagicMock(return_value=None)

    result = provider.identify_media()
    assert result == "fallback-bucket"


def test_identify_media_complete_failure(mocker):
    """Tests identify_media returns None when everything fails."""
    mocker.patch("app.providers.cloud.boto3")
    provider = CloudStorageProvider({"bucket_name": "b"})
    provider.s3.get_object = MagicMock(side_effect=Exception("fail"))
    provider.s3.head_bucket = MagicMock(side_effect=Exception("fail"))

    assert provider.identify_media() is None


# ── Write Operations ──

def test_write_archive_plain(mocker):
    """Tests writing an unencrypted archive."""
    mocker.patch("app.providers.cloud.boto3")
    provider = CloudStorageProvider({"bucket_name": "b", "obfuscate_filenames": False})

    stream = io.BytesIO(b"archive content")
    provider.s3.upload_fileobj = MagicMock(return_value=None)

    location = provider.write_archive("M1", stream)

    assert location.startswith("archives/archives/")
    assert location.endswith(".tar")
    provider.s3.upload_fileobj.assert_called_once()


def test_write_file_direct_plain(mocker):
    """Tests writing an unencrypted object directly."""
    mocker.patch("app.providers.cloud.boto3")
    provider = CloudStorageProvider({"bucket_name": "b", "obfuscate_filenames": False})

    stream = io.BytesIO(b"file content")
    provider.s3.upload_fileobj = MagicMock(return_value=None)

    location = provider.write_file_direct("M1", "photos/image.jpg", stream)

    assert location == "objects/photos/image.jpg"

def test_initialize_media_clears_and_tags(mocker):
    """Tests initialize_media clears existing objects and writes .tapehoard_id."""
    mocker.patch("app.providers.cloud.boto3")
    provider = CloudStorageProvider({"bucket_name": "b"})

    provider.s3.head_bucket = MagicMock(return_value=None)
    mock_paginator = MagicMock()
    mock_paginator.paginate = MagicMock(
        return_value=[{"Contents": [{"Key": "old1"}, {"Key": "old2"}]}]
    )
    provider.s3.get_paginator = MagicMock(return_value=mock_paginator)
    provider.s3.delete_objects = MagicMock(return_value=None)
    provider.s3.put_object = MagicMock(return_value=None)

    result = provider.initialize_media("NEW_DISK")

    assert result is True
    provider.s3.delete_objects.assert_called_once()
    provider.s3.put_object.assert_called_once()
    call_kwargs = provider.s3.put_object.call_args[1]
    assert call_kwargs["Key"] == ".tapehoard_id"
    assert call_kwargs["Body"] == b"NEW_DISK"


def test_initialize_media_failure(mocker):
    """Tests initialize_media returns False on error."""
    mocker.patch("app.providers.cloud.boto3")
    provider = CloudStorageProvider({"bucket_name": "b"})
    provider.s3.head_bucket = MagicMock(side_effect=Exception("no access"))

    assert provider.initialize_media("X") is False


def test_prepare_for_write_match(mocker):
    """Tests prepare_for_write when media identifier matches."""
    mocker.patch("app.providers.cloud.boto3")
    provider = CloudStorageProvider({"bucket_name": "b"})
    provider.s3.head_bucket = MagicMock(return_value=None)
    provider.s3.get_object = MagicMock(side_effect=Exception("not found"))

    # Fallback to bucket name
    assert provider.prepare_for_write("b") is True
    assert provider.prepare_for_write("wrong") is False


# ── Read Operations ──

def test_read_archive_plain(mocker):
    """Tests reading an unencrypted archive."""
    mocker.patch("app.providers.cloud.boto3")
    provider = CloudStorageProvider({"bucket_name": "b"})

    provider.s3.get_object = MagicMock(
        return_value={
            "Body": io.BytesIO(b"raw archive data"),
            "Metadata": {},
        }
    )

    result = provider.read_archive("M1", "archives/1.tar")
    assert result.read() == b"raw archive data"

def test_read_archive_encrypted(mocker, db_session):
    """Tests round-trip encryption/decryption for archives."""
    from app.db import models
    from Crypto.Cipher import AES
    from Crypto.Protocol.KDF import PBKDF2
    from Crypto.Hash import SHA256

    # Seed passphrase in keystore
    db_session.add(
        models.SystemSetting(key="secrets", value='{"cloud-enc": "my-passphrase-123"}')
    )
    db_session.commit()

    mocker.patch("app.providers.cloud.boto3")
    provider = CloudStorageProvider(
        {
            "bucket_name": "b",
            "encryption_secret_name": "cloud-enc",
        }
    )

    # Encrypt data ourselves to simulate stored payload
    original_data = b"secret archive content"
    salt = os.urandom(16)
    nonce = os.urandom(12)
    key = PBKDF2(
        "my-passphrase-123", salt, dkLen=32, count=100000, hmac_hash_module=SHA256
    )
    cipher = AES.new(key, AES.MODE_GCM, nonce=nonce)
    ciphertext, tag = cipher.encrypt_and_digest(original_data)
    payload = salt + nonce + tag + ciphertext

    provider.s3.get_object = MagicMock(
        return_value={
            "Body": io.BytesIO(payload),
            "Metadata": {"tapehoard-encrypted": "v2-gcm"},
        }
    )

    result = provider.read_archive("M1", "archives/enc.tar")
    assert result.read() == original_data

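The test above derives its key with PBKDF2-HMAC-SHA256 (100,000 iterations, 32-byte key) and packs the stored payload as `salt + nonce + tag + ciphertext` (16 + 12 + 16 bytes, then the ciphertext). The same derivation exists in the standard library, which makes the layout easy to sanity-check outside the suite; both helpers below are illustrative sketches, not the provider's real API:

```python
import hashlib
import os


def derive_key(passphrase: str, salt: bytes) -> bytes:
    # Same parameters the test uses: PBKDF2-HMAC-SHA256, 100k rounds, 32 bytes.
    return hashlib.pbkdf2_hmac(
        "sha256", passphrase.encode("utf-8"), salt, 100000, dklen=32
    )


def split_payload(payload: bytes):
    # Fixed-width header: 16-byte salt, 12-byte GCM nonce, 16-byte auth tag,
    # followed by the ciphertext itself.
    return payload[:16], payload[16:28], payload[28:44], payload[44:]


salt = os.urandom(16)
key = derive_key("my-passphrase-123", salt)
```

Storing the salt and nonce alongside the ciphertext means decryption needs only the passphrase; the GCM tag is what lets `read_archive` detect tampering.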
def test_read_archive_encrypted_tampered(mocker, db_session):
    """Tests that tampered encrypted archive raises ValueError."""
    from app.db import models

    db_session.add(
        models.SystemSetting(key="secrets", value='{"cloud-enc": "my-passphrase-123"}')
    )
    db_session.commit()

    mocker.patch("app.providers.cloud.boto3")
    provider = CloudStorageProvider(
        {
            "bucket_name": "b",
            "encryption_secret_name": "cloud-enc",
        }
    )

    # Corrupt payload: valid structure but wrong ciphertext
    fake_payload = os.urandom(16) + os.urandom(12) + os.urandom(16) + b"garbage"

    provider.s3.get_object = MagicMock(
        return_value={
            "Body": io.BytesIO(fake_payload),
            "Metadata": {"tapehoard-encrypted": "v2-gcm"},
        }
    )

    with pytest.raises(ValueError, match="tampering detected"):
        provider.read_archive("M1", "archives/bad.tar")

# ── Encryption Round-Trip ──


def test_write_and_read_archive_encrypted(mocker, db_session):
    """End-to-end test: write encrypted archive, read it back."""
    from app.db import models

    db_session.add(
        models.SystemSetting(key="secrets", value='{"cloud-enc": "my-passphrase-123"}')
    )
    db_session.commit()

    mocker.patch("app.providers.cloud.boto3")
    provider = CloudStorageProvider(
        {
            "bucket_name": "b",
            "encryption_secret_name": "cloud-enc",
            "obfuscate_filenames": False,
        }
    )

    # Capture the uploaded payload
    uploaded = {}

    def capture_put_object(**kwargs):
        uploaded["key"] = kwargs["Key"]
        uploaded["body"] = kwargs["Body"]
        uploaded["metadata"] = kwargs.get("Metadata", {})

    provider.s3.put_object = MagicMock(side_effect=capture_put_object)

    original = b"round-trip test data"
    location = provider.write_archive("M1", io.BytesIO(original))

    # Verify upload happened with encryption metadata
    assert uploaded["metadata"].get("x-amz-meta-tapehoard-encrypted") == "v2-gcm"
    assert uploaded["metadata"].get("x-amz-meta-tapehoard-type") == "archive"

    # Now read it back
    provider.s3.get_object = MagicMock(
        return_value={
            "Body": io.BytesIO(uploaded["body"]),
            "Metadata": {"tapehoard-encrypted": "v2-gcm"},
        }
    )

    result = provider.read_archive("M1", location)
    assert result.read() == original

# ── Misc ──


def test_get_name(mocker):
    """Tests get_name returns provider type string."""
    mocker.patch("app.providers.cloud.boto3")
    provider = CloudStorageProvider({"bucket_name": "b", "provider": "Wasabi"})
    assert provider.get_name() == "Cloud (Wasabi)"


def test_finalize_media(mocker):
    """Tests finalize_media is a no-op that logs."""
    mocker.patch("app.providers.cloud.boto3")
    provider = CloudStorageProvider({"bucket_name": "b"})
    # Should not raise
    provider.finalize_media("M1")

@@ -150,8 +150,31 @@ def test_run_backup_mocked(db_session, mocker, tmp_path):

    # Verify result
    db_session.expire_all()

    # Media usage updated
    assert media.bytes_used > 0

    # FileVersion recorded for the archived file
    version = (
        db_session.query(models.FileVersion)
        .filter_by(filesystem_state_id=f1.id)
        .first()
    )
    assert version is not None
    assert version.media_id == media.id
    assert version.offset_start == 0
    assert version.offset_end == f1.size

    # Backup job completed successfully
    refreshed_job = db_session.get(models.Job, job.id)
    assert refreshed_job.status == "COMPLETED"
    assert refreshed_job.progress == 100.0

    # Provider was asked to write the archive
    mock_provider.write_archive.assert_called_once()
    call_args = mock_provider.write_archive.call_args
    assert call_args[0][0] == "DISK_001"  # media identifier


def test_archiver_saturated_media_logic(db_session, mocker, tmp_path):
    """Verifies that media is marked full and priority ceded based on hardware feedback."""
@@ -436,6 +459,82 @@ def test_run_restore_mocked(db_session, mocker, tmp_path):
    assert expected_file.read_bytes() == b"hello"

def test_backup_checkpoints_per_chunk(db_session, mocker, tmp_path):
    """Verifies that FileVersions are committed after each chunk so a mid-job
    failure doesn't orphan already-written archives on tape."""
    staging = tmp_path / "staging"
    staging.mkdir()
    archiver = ArchiverService(staging_directory=str(staging))

    # Capacity 10GB -> MAX_CHUNK_SIZE = 100MB
    media = models.StorageMedia(
        media_type="tape",
        identifier="TAPE_CHK",
        capacity=10 * 1024 * 1024 * 1024,
        status="active",
        bytes_used=0,
    )
    db_session.add(media)

    # Create two tiny source files, but lie about their size to force chunking
    files = []
    for i in range(2):
        source_file = tmp_path / f"chunk_{i}.bin"
        source_file.write_bytes(b"0")
        f = models.FilesystemState(
            file_path=str(source_file),
            size=60 * 1024 * 1024,  # 60MB
            mtime=1,
            sha256_hash=f"hash_{i}",
        )
        db_session.add(f)
        files.append(f)
    db_session.commit()

    # Make the staging tar appear to be 60MB so bytes_used updates correctly
    mocker.patch("os.path.getsize", return_value=60 * 1024 * 1024)

    mock_provider = mocker.MagicMock()
    mock_provider.capabilities = {"supports_random_access": False}
    mock_provider.identify_media.return_value = "TAPE_CHK"
    mock_provider.prepare_for_write.return_value = True
    # First chunk succeeds, second chunk fails
    mock_provider.write_archive.side_effect = ["ARCH_1", Exception("Tape write failed")]

    mocker.patch.object(archiver, "_get_storage_provider", return_value=mock_provider)

    from app.services.scanner import JobManager

    job = JobManager.create_job(db_session, "BACKUP")

    archiver.run_backup(db_session, media.id, job.id)

    # Verify job failed
    db_session.expire_all()
    refreshed_job = db_session.get(models.Job, job.id)
    assert refreshed_job.status == "FAILED"

    # Verify first chunk was checkpointed
    versions = (
        db_session.query(models.FileVersion)
        .filter_by(filesystem_state_id=files[0].id)
        .all()
    )
    assert len(versions) == 1
    assert versions[0].file_number == "ARCH_1"

    # Verify second chunk was NOT checkpointed
    versions_2 = (
        db_session.query(models.FileVersion)
        .filter_by(filesystem_state_id=files[1].id)
        .all()
    )
    assert len(versions_2) == 0

    # Verify media bytes_used reflects only first chunk
    assert media.bytes_used == 60 * 1024 * 1024

def test_cancelled_backup_job_status(db_session, mocker, tmp_path):
|
||||
"""Verifies that a cancelled backup job never calls complete_job."""
|
||||
staging = tmp_path / "staging"
|
||||
|
||||
@@ -1,10 +1,9 @@
import hashlib
from datetime import datetime, timezone

from app.services.scanner import (
    ScannerService,
    JobManager,
    _hash_file_batch_fast,
    _FAST_HASH_BINARY,
)
from app.db import models

@@ -115,9 +114,6 @@ def test_scan_sources_mocked(db_session, mocker):
    """Tests the discovery scan with mocked filesystem."""
    scanner = ScannerService()

    # Disable fast find so the test uses the os.walk fallback path
    mocker.patch("app.services.scanner._FAST_FIND_BINARY", None)

    # Mock settings
    mocker.patch("app.api.common.get_source_roots", return_value=["/mock_source"])
    mocker.patch("app.api.common.get_exclusion_spec", return_value=None)
@@ -143,52 +139,10 @@ def test_scan_sources_mocked(db_session, mocker):
    assert record.size == 500


def test_hash_file_batch_fast(tmp_path):
    """Tests native sha256sum/shasum batch hashing if available."""
    if _FAST_HASH_BINARY is None:
        # Skip if no native hash binary is available
        return

    # Create test files
    files = {}
    for i in range(5):
        content = f"test content {i}".encode()
        f = tmp_path / f"file_{i}.txt"
        f.write_bytes(content)
        files[str(f)] = hashlib.sha256(content).hexdigest()

    # Hash via native binary
    results = _hash_file_batch_fast(list(files.keys()), _FAST_HASH_BINARY)

    assert len(results) == 5
    for path, expected_hash in files.items():
        assert results[path] == expected_hash


def test_hash_file_batch_fast_empty():
    """Tests that empty batch returns empty results."""
    if _FAST_HASH_BINARY is None:
        return

    results = _hash_file_batch_fast([], _FAST_HASH_BINARY)
    assert results == {}


def test_hash_file_batch_fast_nonexistent():
    """Tests that non-existent files are gracefully handled."""
    if _FAST_HASH_BINARY is None:
        return

    results = _hash_file_batch_fast(["/nonexistent/path"], _FAST_HASH_BINARY)
    # Non-existent files may or may not appear in results depending on binary behavior
    assert isinstance(results, dict)


def test_missing_file_marked_deleted_at_end_of_scan(db_session, mocker):
    """Tests that files not seen during a scan are marked as deleted."""
    scanner = ScannerService()

    mocker.patch("app.services.scanner._FAST_FIND_BINARY", None)
    mocker.patch("app.api.common.get_source_roots", return_value=["/mock_source"])
    mocker.patch("app.api.common.get_exclusion_spec", return_value=None)
    mocker.patch("os.walk", return_value=[])
@@ -222,10 +176,7 @@ def test_missing_file_marked_deleted_at_end_of_scan(db_session, mocker):
def test_existing_file_not_marked_deleted(db_session, mocker):
    """Tests that files found during scan retain is_deleted=False."""
    scanner = ScannerService()
    print(f"DEBUG test_existing: scanner.is_running = {scanner.is_running}")
    print(f"DEBUG test_existing: scanner.is_hashing = {scanner.is_hashing}")

    mocker.patch("app.services.scanner._FAST_FIND_BINARY", None)
    mocker.patch("app.api.common.get_source_roots", return_value=["/mock_source"])
    mocker.patch("app.api.common.get_exclusion_spec", return_value=None)
    mocker.patch("os.path.exists", return_value=True)
@@ -260,8 +211,6 @@ def test_missing_file_during_hashing_marked_deleted(db_session, mocker):
    """Tests that files missing during hashing are marked as deleted."""
    scanner = ScannerService()

    mocker.patch("app.services.scanner._FAST_HASH_BINARY", None)

    f = models.FilesystemState(
        file_path="/data/vanished.bin", size=10, mtime=1, is_ignored=False
    )
@@ -276,8 +225,11 @@ def test_missing_file_during_hashing_marked_deleted(db_session, mocker):
    assert f.is_deleted is True

def test_missing_file_skipped_in_hashing_query(db_session):
    """Tests that already-deleted files are excluded from hashing targets."""
def test_deleted_files_excluded_from_hashing(db_session):
    """Tests that run_hashing skips already-deleted files."""
    scanner = ScannerService()
    scanner.is_running = False  # Causes run_hashing to exit when no targets found

    deleted_file = models.FilesystemState(
        file_path="/data/deleted.bin",
        size=10,
@@ -289,13 +241,13 @@ def test_missing_file_skipped_in_hashing_query(db_session):
    db_session.add(deleted_file)
    db_session.commit()

    pending = (
        db_session.query(models.FilesystemState)
        .filter(
            models.FilesystemState.sha256_hash.is_(None),
            models.FilesystemState.is_ignored.is_(False),
            models.FilesystemState.is_deleted.is_(False),
        )
        .all()
    )
    assert len(pending) == 0
    scanner.run_hashing()

    # Deleted file should not have been processed (hash still None)
    db_session.refresh(deleted_file)
    assert deleted_file.sha256_hash is None

    # A HASH job should have been created and completed (no work to do)
    job = db_session.query(models.Job).filter_by(job_type="HASH").first()
    assert job is not None
    assert job.status == "COMPLETED"

@@ -0,0 +1,135 @@
from app.services.scheduler import SchedulerService
from app.db import models


def test_scheduler_start_stop():
    """Tests scheduler lifecycle (start, stop, idempotent)."""
    scheduler = SchedulerService()
    assert not scheduler.scheduler.running

    scheduler.start()
    assert scheduler.scheduler.running

    # Idempotent start
    scheduler.start()
    assert scheduler.scheduler.running

    scheduler.stop()
    assert not scheduler.scheduler.running

    # Idempotent stop
    scheduler.stop()
    assert not scheduler.scheduler.running


def test_scheduler_load_schedules_empty():
    """Tests load_schedules with no cron settings configured."""
    scheduler = SchedulerService()
    scheduler.start()

    scheduler.load_schedules()

    # No jobs should be registered
    assert scheduler.scheduler.get_job("system_scan") is None
    assert scheduler.scheduler.get_job("system_archival") is None

    scheduler.stop()


def test_scheduler_load_schedules_with_scan(db_session):
    """Tests load_schedules picks up a scan schedule from settings."""
    db_session.add(models.SystemSetting(key="schedule_scan", value="0 2 * * *"))
    db_session.commit()

    scheduler = SchedulerService()
    scheduler.start()

    scheduler.load_schedules()

    job = scheduler.scheduler.get_job("system_scan")
    assert job is not None
    assert job.id == "system_scan"

    scheduler.stop()


def test_scheduler_add_remove_job():
    """Tests adding and removing scheduled jobs."""
    scheduler = SchedulerService()
    scheduler.start()

    def dummy_job():
        pass

    scheduler.add_job("test_job", dummy_job, "0 0 * * *")
    assert scheduler.scheduler.get_job("test_job") is not None

    scheduler.remove_job("test_job")
    assert scheduler.scheduler.get_job("test_job") is None

    # Idempotent remove
    scheduler.remove_job("test_job")
    assert scheduler.scheduler.get_job("test_job") is None

    scheduler.stop()


def test_scheduler_add_job_empty_cron():
    """Tests that empty/whitespace cron expression removes the job."""
    scheduler = SchedulerService()
    scheduler.start()

    def dummy_job():
        pass

    scheduler.add_job("test_job", dummy_job, "0 0 * * *")
    assert scheduler.scheduler.get_job("test_job") is not None

    # Empty string should remove
    scheduler.add_job("test_job", dummy_job, " ")
    assert scheduler.scheduler.get_job("test_job") is None

    scheduler.stop()


def test_scheduler_reload(db_session, mocker):
    """Tests reload calls load_schedules."""
    db_session.add(models.SystemSetting(key="schedule_scan", value="0 3 * * *"))
    db_session.commit()

    scheduler = SchedulerService()
    scheduler.start()

    load_spy = mocker.spy(scheduler, "load_schedules")
    scheduler.reload()

    load_spy.assert_called_once()

    job = scheduler.scheduler.get_job("system_scan")
    assert job is not None

    scheduler.stop()


def test_scheduler_run_system_scan_skips_when_running(mocker):
    """Tests run_system_scan is skipped when scanner_manager is already running."""
    scheduler = SchedulerService()

    mocker.patch("app.services.scheduler.scanner_manager.is_running", True)
    scan_sources_spy = mocker.patch(
        "app.services.scheduler.scanner_manager.scan_sources"
    )

    scheduler.run_system_scan()
    scan_sources_spy.assert_not_called()


def test_scheduler_run_system_archival_no_online_media(db_session, mocker):
    """Tests run_system_archival skips when no active media is online."""
    scheduler = SchedulerService()

    # No media in DB
    run_backup_spy = mocker.patch("app.services.scheduler.archiver_manager.run_backup")

    scheduler.run_system_archival()
    run_backup_spy.assert_not_called()
Binary file not shown (image added, 126 KiB).
Binary file not shown (image added, 83 KiB).
Binary file not shown (image added, 129 KiB).
Binary file not shown (image added, 91 KiB).
File diff suppressed because one or more lines are too long
@@ -2,7 +2,7 @@

import type { Client, Options as Options2, TDataShape } from './client';
import { client } from './client.gen';
import type { AddDirectoryToRestoreQueueData, AddDirectoryToRestoreQueueErrors, AddDirectoryToRestoreQueueResponses, AddFileToRestoreQueueData, AddFileToRestoreQueueErrors, AddFileToRestoreQueueResponses, ArchiveBrowseData, ArchiveBrowseErrors, ArchiveBrowseResponses, ArchiveMetadataData, ArchiveMetadataErrors, ArchiveMetadataResponses, ArchiveSearchData, ArchiveSearchErrors, ArchiveSearchResponses, ArchiveTreeData, ArchiveTreeErrors, ArchiveTreeResponses, BatchAddToRestoreQueueData, BatchAddToRestoreQueueErrors, BatchAddToRestoreQueueResponses, BatchConfirmDiscrepanciesData, BatchConfirmDiscrepanciesErrors, BatchConfirmDiscrepanciesResponses, BatchDeleteDiscrepanciesData, BatchDeleteDiscrepanciesErrors, BatchDeleteDiscrepanciesResponses, BatchDismissDiscrepanciesData, BatchDismissDiscrepanciesErrors, BatchDismissDiscrepanciesResponses, BatchResolveDiscrepanciesData, BatchResolveDiscrepanciesErrors, BatchResolveDiscrepanciesResponses, BatchTrackData, BatchTrackErrors, BatchTrackResponses, BrowseDiscrepanciesData, BrowseDiscrepanciesErrors, BrowseDiscrepanciesResponses, BrowseRestoreQueueData, BrowseRestoreQueueErrors, BrowseRestoreQueueResponses, CancelJobData, CancelJobErrors, CancelJobResponses, CheckHealthData, CheckHealthResponses, ClearRestoreQueueData, ClearRestoreQueueResponses, ConfirmDiscrepancyData, ConfirmDiscrepancyErrors, ConfirmDiscrepancyResponses, CreateMediaData, CreateMediaErrors, CreateMediaResponses, CreateSecretData, CreateSecretErrors, CreateSecretResponses, DeleteDiscrepancyData, DeleteDiscrepancyErrors, DeleteDiscrepancyResponses, DeleteMediaData, DeleteMediaErrors, DeleteMediaResponses, DeleteSecretData, DeleteSecretErrors, DeleteSecretResponses, DetectMediaData, DetectMediaResponses, DiscoverHardwareData, DiscoverHardwareResponses, DismissDiscrepancyData, DismissDiscrepancyErrors, DismissDiscrepancyResponses, DownloadExclusionReportData, DownloadExclusionReportErrors, DownloadExclusionReportResponses, ExportDatabaseData, ExportDatabaseResponses, FilesystemBrowseData, FilesystemBrowseErrors, FilesystemBrowseResponses, FilesystemSearchData, FilesystemSearchErrors, FilesystemSearchResponses, FilesystemTreeData, FilesystemTreeErrors, FilesystemTreeResponses, GetAnalyticsData, GetAnalyticsResponses, GetDashboardStatsData, GetDashboardStatsResponses, GetDiscrepancyTreeData, GetDiscrepancyTreeErrors, GetDiscrepancyTreeResponses, GetJobCountData, GetJobCountResponses, GetJobData, GetJobErrors, GetJobLogsData, GetJobLogsErrors, GetJobLogsResponses, GetJobResponses, GetJobStatsData, GetJobStatsResponses, GetRestoreManifestData, GetRestoreManifestResponses, GetRestoreQueueData, GetRestoreQueueResponses, GetRestoreQueueTreeData, GetRestoreQueueTreeErrors, GetRestoreQueueTreeResponses, GetScanStatusData, GetScanStatusResponses, GetSecretData, GetSecretErrors, GetSecretResponses, GetSettingsData, GetSettingsResponses, GetTreemapData, GetTreemapResponses, IgnoreHardwareData, IgnoreHardwareErrors, IgnoreHardwareResponses, ImportDatabaseData, ImportDatabaseErrors, ImportDatabaseResponses, InitializeMediaData, InitializeMediaErrors, InitializeMediaResponses, ListBackupsData, ListBackupsResponses, ListDirectoriesData, ListDirectoriesErrors, ListDirectoriesResponses, ListDiscrepanciesData, ListDiscrepanciesResponses, ListJobsData, ListJobsErrors, ListJobsResponses, ListMediaData, ListMediaErrors, ListMediaResponses, ListProvidersData, ListProvidersResponses, ListSecretsData, ListSecretsResponses, RemoveFromRestoreQueueData, RemoveFromRestoreQueueErrors, RemoveFromRestoreQueueResponses, ReorderMediaData, ReorderMediaErrors, ReorderMediaResponses, ResetTestEnvironmentData, ResetTestEnvironmentResponses, RetryJobData, RetryJobErrors, RetryJobResponses, StreamJobsData, StreamJobsResponses, TestExclusionsData, TestExclusionsErrors, TestExclusionsResponses, TestNotificationData, TestNotificationErrors, TestNotificationResponses, TriggerAutoBackupData, TriggerAutoBackupResponses, TriggerBackupData, TriggerBackupErrors, TriggerBackupResponses, TriggerIndexingData, TriggerIndexingResponses, TriggerRestoreData, TriggerRestoreErrors, TriggerRestoreResponses, TriggerScanData, TriggerScanResponses, UndoDismissDiscrepancyData, UndoDismissDiscrepancyErrors, UndoDismissDiscrepancyResponses, UpdateMediaData, UpdateMediaErrors, UpdateMediaResponses, UpdateSettingsData, UpdateSettingsErrors, UpdateSettingsResponses } from './types.gen';
import type { AddDirectoryToRestoreQueueData, AddDirectoryToRestoreQueueErrors, AddDirectoryToRestoreQueueResponses, AddFileToRestoreQueueData, AddFileToRestoreQueueErrors, AddFileToRestoreQueueResponses, ArchiveBrowseData, ArchiveBrowseErrors, ArchiveBrowseResponses, ArchiveMetadataData, ArchiveMetadataErrors, ArchiveMetadataResponses, ArchiveSearchData, ArchiveSearchErrors, ArchiveSearchResponses, ArchiveTreeData, ArchiveTreeErrors, ArchiveTreeResponses, BatchAddToRestoreQueueData, BatchAddToRestoreQueueErrors, BatchAddToRestoreQueueResponses, BatchConfirmDiscrepanciesData, BatchConfirmDiscrepanciesErrors, BatchConfirmDiscrepanciesResponses, BatchDeleteDiscrepanciesData, BatchDeleteDiscrepanciesErrors, BatchDeleteDiscrepanciesResponses, BatchDismissDiscrepanciesData, BatchDismissDiscrepanciesErrors, BatchDismissDiscrepanciesResponses, BatchResolveDiscrepanciesData, BatchResolveDiscrepanciesErrors, BatchResolveDiscrepanciesResponses, BatchTrackData, BatchTrackErrors, BatchTrackResponses, BrowseDiscrepanciesData, BrowseDiscrepanciesErrors, BrowseDiscrepanciesResponses, BrowseRestoreQueueData, BrowseRestoreQueueErrors, BrowseRestoreQueueResponses, CancelJobData, CancelJobErrors, CancelJobResponses, CheckHealthData, CheckHealthResponses, ClearRestoreQueueData, ClearRestoreQueueResponses, ConfirmDiscrepancyData, ConfirmDiscrepancyErrors, ConfirmDiscrepancyResponses, CreateMediaData, CreateMediaErrors, CreateMediaResponses, CreateSecretData, CreateSecretErrors, CreateSecretResponses, DeleteDiscrepancyData, DeleteDiscrepancyErrors, DeleteDiscrepancyResponses, DeleteMediaData, DeleteMediaErrors, DeleteMediaResponses, DeleteSecretData, DeleteSecretErrors, DeleteSecretResponses, DetectMediaData, DetectMediaResponses, DiscoverHardwareData, DiscoverHardwareResponses, DismissDiscrepancyData, DismissDiscrepancyErrors, DismissDiscrepancyResponses, DownloadExclusionReportData, DownloadExclusionReportErrors, DownloadExclusionReportResponses, ExportDatabaseData, ExportDatabaseResponses, FilesystemBrowseData, FilesystemBrowseErrors, FilesystemBrowseResponses, FilesystemSearchData, FilesystemSearchErrors, FilesystemSearchResponses, FilesystemTreeData, FilesystemTreeErrors, FilesystemTreeResponses, GetAnalyticsData, GetAnalyticsResponses, GetDashboardStatsData, GetDashboardStatsResponses, GetDiscrepancyTreeData, GetDiscrepancyTreeErrors, GetDiscrepancyTreeResponses, GetJobCountData, GetJobCountResponses, GetJobData, GetJobErrors, GetJobLogsData, GetJobLogsErrors, GetJobLogsResponses, GetJobResponses, GetJobStatsData, GetJobStatsResponses, GetRestoreManifestData, GetRestoreManifestResponses, GetRestoreQueueData, GetRestoreQueueResponses, GetRestoreQueueTreeData, GetRestoreQueueTreeErrors, GetRestoreQueueTreeResponses, GetScanStatusData, GetScanStatusResponses, GetSecretData, GetSecretErrors, GetSecretResponses, GetSettingsData, GetSettingsResponses, GetStagingInfoData, GetStagingInfoResponses, GetTreemapData, GetTreemapResponses, IgnoreHardwareData, IgnoreHardwareErrors, IgnoreHardwareResponses, ImportDatabaseData, ImportDatabaseErrors, ImportDatabaseResponses, InitializeMediaData, InitializeMediaErrors, InitializeMediaResponses, ListBackupsData, ListBackupsResponses, ListDirectoriesData, ListDirectoriesErrors, ListDirectoriesResponses, ListDiscrepanciesData, ListDiscrepanciesResponses, ListJobsData, ListJobsErrors, ListJobsResponses, ListMediaData, ListMediaErrors, ListMediaResponses, ListProvidersData, ListProvidersResponses, ListSecretsData, ListSecretsResponses, RemoveFromRestoreQueueData, RemoveFromRestoreQueueErrors, RemoveFromRestoreQueueResponses, ReorderMediaData, ReorderMediaErrors, ReorderMediaResponses, ResetTestEnvironmentData, ResetTestEnvironmentResponses, RetryJobData, RetryJobErrors, RetryJobResponses, StreamJobsData, StreamJobsResponses, TestExclusionsData, TestExclusionsErrors, TestExclusionsResponses, TestNotificationData, TestNotificationErrors, TestNotificationResponses, TriggerAutoBackupData, TriggerAutoBackupResponses, TriggerBackupData, TriggerBackupErrors, TriggerBackupResponses, TriggerIndexingData, TriggerIndexingResponses, TriggerRestoreData, TriggerRestoreErrors, TriggerRestoreResponses, TriggerScanData, TriggerScanResponses, UndoDismissDiscrepancyData, UndoDismissDiscrepancyErrors, UndoDismissDiscrepancyResponses, UpdateMediaData, UpdateMediaErrors, UpdateMediaResponses, UpdateSettingsData, UpdateSettingsErrors, UpdateSettingsResponses } from './types.gen';

export type Options<TData extends TDataShape = TDataShape, ThrowOnError extends boolean = boolean, TResponse = unknown> = Options2<TData, ThrowOnError, TResponse> & {
    /**
@@ -32,6 +32,13 @@ export const resetTestEnvironment = <ThrowOnError extends boolean = false>(optio
 */
export const getDashboardStats = <ThrowOnError extends boolean = false>(options?: Options<GetDashboardStatsData, ThrowOnError>) => (options?.client ?? client).get<GetDashboardStatsResponses, unknown, ThrowOnError>({ url: '/system/dashboard/stats', ...options });

/**
 * Get Staging Info
 *
 * Returns disk usage information for the backup staging directory.
 */
export const getStagingInfo = <ThrowOnError extends boolean = false>(options?: Options<GetStagingInfoData, ThrowOnError>) => (options?.client ?? client).get<GetStagingInfoResponses, unknown, ThrowOnError>({ url: '/system/staging/info', ...options });

/**
 * List Jobs
 *
@@ -53,6 +60,13 @@ export const getJobCount = <ThrowOnError extends boolean = false>(options?: Opti
 */
export const getJobStats = <ThrowOnError extends boolean = false>(options?: Options<GetJobStatsData, ThrowOnError>) => (options?.client ?? client).get<GetJobStatsResponses, unknown, ThrowOnError>({ url: '/system/jobs/stats', ...options });

/**
 * Stream Jobs
 *
 * Server-Sent Events (SSE) endpoint for real-time job status updates.
 */
export const streamJobs = <ThrowOnError extends boolean = false>(options?: Options<StreamJobsData, ThrowOnError>) => (options?.client ?? client).get<StreamJobsResponses, unknown, ThrowOnError>({ url: '/system/jobs/stream', ...options });

/**
 * Get Job
 *
@@ -81,13 +95,6 @@ export const cancelJob = <ThrowOnError extends boolean = false>(options: Options
 */
export const retryJob = <ThrowOnError extends boolean = false>(options: Options<RetryJobData, ThrowOnError>) => (options.client ?? client).post<RetryJobResponses, RetryJobErrors, ThrowOnError>({ url: '/system/jobs/{job_id}/retry', ...options });

/**
 * Stream Jobs
 *
 * Server-Sent Events (SSE) endpoint for real-time job status updates.
 */
export const streamJobs = <ThrowOnError extends boolean = false>(options?: Options<StreamJobsData, ThrowOnError>) => (options?.client ?? client).get<StreamJobsResponses, unknown, ThrowOnError>({ url: '/system/jobs/stream', ...options });

/**
 * Trigger Scan
 *

@@ -1106,6 +1106,28 @@ export type SettingSchema = {
    value: string;
};

/**
 * StagingInfoSchema
 */
export type StagingInfoSchema = {
    /**
     * Path
     */
    path: string;
    /**
     * Total Bytes
     */
    total_bytes: number;
    /**
     * Used Bytes
     */
    used_bytes: number;
    /**
     * Free Bytes
     */
    free_bytes: number;
};

/**
 * StorageProviderSchema
 */
@@ -1338,6 +1360,22 @@ export type GetDashboardStatsResponses = {

export type GetDashboardStatsResponse = GetDashboardStatsResponses[keyof GetDashboardStatsResponses];

export type GetStagingInfoData = {
    body?: never;
    path?: never;
    query?: never;
    url: '/system/staging/info';
};

export type GetStagingInfoResponses = {
    /**
     * Successful Response
     */
    200: StagingInfoSchema;
};

export type GetStagingInfoResponse = GetStagingInfoResponses[keyof GetStagingInfoResponses];

export type ListJobsData = {
    body?: never;
    path?: never;
@@ -1402,6 +1440,20 @@ export type GetJobStatsResponses = {
    200: unknown;
};

export type StreamJobsData = {
    body?: never;
    path?: never;
    query?: never;
    url: '/system/jobs/stream';
};

export type StreamJobsResponses = {
    /**
     * Successful Response
     */
    200: unknown;
};

export type GetJobData = {
    body?: never;
    path: {
@@ -1520,20 +1572,6 @@ export type RetryJobResponses = {
    200: unknown;
};

export type StreamJobsData = {
    body?: never;
    path?: never;
    query?: never;
    url: '/system/jobs/stream';
};

export type StreamJobsResponses = {
    /**
     * Successful Response
     */
    200: unknown;
};

export type TriggerScanData = {
    body?: never;
    path?: never;

@@ -52,8 +52,10 @@
    ignoreHardware,
    listProviders,
    listSecrets,
    getStagingInfo,
    type MediaSchema,
    type StorageProviderSchema
    type StorageProviderSchema,
    type StagingInfoSchema
  } from '$lib/api';
  import { LTO_CAPACITY, PROVIDER_TEMPLATES, type LtoTapeCreateData, type OfflineHddCreateData, type CloudCreateData } from '$lib/types';
  import { dndzone } from 'svelte-dnd-action';
@@ -67,6 +69,7 @@
  let loading = $state(true);
  let showRegisterDialog = $state(false);
  let editingMedia = $state<MediaSchema | null>(null);
  let stagingInfo = $state<StagingInfoSchema | null>(null);

  let activeMedia = $derived(mediaList.filter(m => m.status === 'active'));
  let fullMedia = $derived(mediaList.filter(m => m.status === 'full'));
@@ -243,6 +246,38 @@
    }
  }

  async function pollHardware() {
    try {
      const res = await discoverHardware();
      if (!res.data) return;

      const prevPaths = new Set(discoveredAssets.map(a => a.device_path));
      let hasNew = false;

      discoveredAssets = (res.data as any[]).map(newAsset => {
        if (!prevPaths.has(newAsset.device_path)) {
          hasNew = true;
        }
        const oldAsset = discoveredAssets.find(a => a.device_path === newAsset.device_path);
        if (oldAsset && oldAsset.hardware_info && newAsset.hardware_info) {
          if (Object.keys(newAsset.hardware_info.tape || {}).length === 0 && Object.keys(oldAsset.hardware_info.tape || {}).length > 0) {
            newAsset.hardware_info.tape = oldAsset.hardware_info.tape;
          }
          if (Object.keys(newAsset.hardware_info.drive || {}).length === 0 && Object.keys(oldAsset.hardware_info.drive || {}).length > 0) {
            newAsset.hardware_info.drive = oldAsset.hardware_info.drive;
          }
        }
        return newAsset;
      });

      if (hasNew) {
        loadMedia(true, true);
      }
    } catch (error) {
      console.error("Hardware discovery failed:", error);
    }
  }

  let prevOnlineCount = $state(0);

  $effect(() => {
@@ -263,10 +298,20 @@
    }
  }

  async function loadStagingInfo() {
    try {
      const res = await getStagingInfo();
      if (res.data) stagingInfo = res.data;
    } catch (error) {
      console.error("Failed to load staging info:", error);
    }
  }

  onMount(async () => {
    // Initial load (non-silent and forced refresh to show live hardware status immediately)
    loadMedia(false, true);
    loadSecrets();
    loadStagingInfo();

    try {
      const res = await listProviders();
@@ -275,7 +320,10 @@
      console.error("Failed to load storage providers:", error);
    }

    pollInterval = setInterval(() => loadMedia(true), POLL_SLOW);
    pollInterval = setInterval(() => {
      pollHardware();
      loadStagingInfo();
    }, POLL_SLOW);
  });

  onDestroy(() => {
@@ -1095,7 +1143,7 @@
  <label class="text-xs font-medium text-text-secondary ml-1" for="identifier">
    {newMedia.media_type === 'lto_tape' ? 'Barcode' : newMedia.media_type === 'local_hdd' ? 'Identifier / Serial' : 'Friendly Name'}
  </label>
  <Input id="identifier" bind:value={newMedia.identifier} placeholder={newMedia.media_type === 'lto_tape' ? 'BUP-00001' : newMedia.media_type === 'local_hdd' ? 'Samsung-T7-001' : 'AWS-Production'} class="h-10 bg-bg-primary/50 border-border-color font-mono text-sm" />
  <Input id="identifier" bind:value={newMedia.identifier} placeholder={newMedia.media_type === 'lto_tape' ? 'TAPE01' : newMedia.media_type === 'local_hdd' ? 'Samsung-T7-001' : 'AWS-Production'} class="h-10 bg-bg-primary/50 border-border-color font-mono text-sm" />
</div>

{#if newMedia.media_type === 'lto_tape'}
@@ -1232,27 +1280,9 @@
<h3 class="text-xs font-semibold text-text-secondary uppercase tracking-wider">Configuration</h3>

{#if newMedia.media_type === 'lto_tape'}
  <div class="grid grid-cols-2 gap-4">
    <div class="flex items-center gap-3 h-10 px-1">
      <input id="compression" type="checkbox" bind:checked={newMedia.compression} class="w-4 h-4 rounded border-border-color bg-bg-primary text-blue-600 focus:ring-blue-500/20" />
      <label class="text-xs font-medium text-text-secondary cursor-pointer" for="compression">Hardware Compression</label>
    </div>
    <div class="flex items-center gap-3 h-10 px-1">
      <input id="worm" type="checkbox" bind:checked={newMedia.worm} class="w-4 h-4 rounded border-border-color bg-bg-primary text-blue-600 focus:ring-blue-500/20" />
      <label class="text-xs font-medium text-text-secondary cursor-pointer" for="worm">WORM (Write Once Read Many)</label>
    </div>
    <div class="flex items-center gap-3 h-10 px-1">
      <input id="write_protected" type="checkbox" bind:checked={newMedia.write_protected} class="w-4 h-4 rounded border-border-color bg-bg-primary text-blue-600 focus:ring-blue-500/20" />
      <label class="text-xs font-medium text-text-secondary cursor-pointer" for="write_protected">Write Protected (Physical)</label>
    </div>
    <div class="flex items-center gap-3 h-10 px-1">
      <input id="cleaning_cartridge" type="checkbox" bind:checked={newMedia.cleaning_cartridge} class="w-4 h-4 rounded border-border-color bg-bg-primary text-blue-600 focus:ring-blue-500/20" />
      <label class="text-xs font-medium text-text-secondary cursor-pointer" for="cleaning_cartridge">Cleaning Cartridge</label>
    </div>
  </div>
  <div class="space-y-2">
    <label class="text-xs font-medium text-text-secondary ml-1" for="encryption_key_id">Encryption Key ID</label>
    <Input id="encryption_key_id" bind:value={newMedia.encryption_key_id} placeholder="Key reference in system keystore" class="h-10 bg-bg-primary/50 border-border-color font-mono text-sm" />
  <div class="flex items-center gap-3 h-10 px-1">
    <input id="compression" type="checkbox" bind:checked={newMedia.compression} class="w-4 h-4 rounded border-border-color bg-bg-primary text-blue-600 focus:ring-blue-500/20" />
    <label class="text-xs font-medium text-text-secondary cursor-pointer" for="compression">Hardware Compression</label>
  </div>
  <div class="space-y-2">
    <label class="text-xs font-medium text-text-secondary ml-1" for="lto-encryption_secret_name">Encryption Secret</label>
@@ -1268,48 +1298,9 @@
<p class="text-[10px] text-text-secondary leading-tight opacity-60">Manage secrets in <a href="/settings" class="text-blue-500 hover:underline">Settings</a>.</p>
</div>
{:else if newMedia.media_type === 'local_hdd'}
<div class="grid grid-cols-2 gap-4">
<div class="flex items-center gap-3 h-10 px-1">
<input id="is_ssd" type="checkbox" bind:checked={newMedia.is_ssd} class="w-4 h-4 rounded border-border-color bg-bg-primary text-blue-600 focus:ring-blue-500/20" />
<label class="text-xs font-medium text-text-secondary cursor-pointer" for="is_ssd">SSD (Solid State Drive)</label>
</div>
<div class="flex items-center gap-3 h-10 px-1">
<input id="encrypted" type="checkbox" bind:checked={newMedia.encrypted} class="w-4 h-4 rounded border-border-color bg-bg-primary text-blue-600 focus:ring-blue-500/20" />
<label class="text-xs font-medium text-text-secondary cursor-pointer" for="encrypted">Drive Encrypted (BitLocker/LUKS)</label>
</div>
</div>
<div class="grid grid-cols-2 gap-6">
<div class="space-y-2">
<label class="text-xs font-medium text-text-secondary ml-1" for="filesystem_type">Filesystem Type</label>
<div class="relative">
<select id="filesystem_type" bind:value={newMedia.filesystem_type} class="w-full h-10 bg-bg-primary border border-border-color rounded-xl px-4 pr-10 text-sm font-medium text-text-primary outline-none focus:ring-2 focus:ring-blue-500/20 transition-all appearance-none cursor-pointer">
<option value="">Select...</option>
<option value="ext4">ext4</option>
<option value="NTFS">NTFS</option>
<option value="APFS">APFS</option>
<option value="exFAT">exFAT</option>
</select>
<ChevronDown size={16} class="absolute right-3 top-1/2 -translate-y-1/2 text-text-secondary pointer-events-none" />
</div>
</div>
<div class="space-y-2">
<label class="text-xs font-medium text-text-secondary ml-1" for="connection_interface">Connection Interface</label>
<div class="relative">
<select id="connection_interface" bind:value={newMedia.connection_interface} class="w-full h-10 bg-bg-primary border border-border-color rounded-xl px-4 pr-10 text-sm font-medium text-text-primary outline-none focus:ring-2 focus:ring-blue-500/20 transition-all appearance-none cursor-pointer">
<option value="">Select...</option>
<option value="USB-A">USB-A</option>
<option value="USB-C">USB-C</option>
<option value="Thunderbolt">Thunderbolt</option>
<option value="SATA">SATA</option>
<option value="NVMe">NVMe</option>
</select>
<ChevronDown size={16} class="absolute right-3 top-1/2 -translate-y-1/2 text-text-secondary pointer-events-none" />
</div>
</div>
</div>
<div class="space-y-2">
<label class="text-xs font-medium text-text-secondary ml-1" for="hdd_encryption_key_id">Encryption Key ID</label>
<Input id="hdd_encryption_key_id" bind:value={newMedia.hdd_encryption_key_id} placeholder="Key reference in system keystore" class="h-10 bg-bg-primary/50 border-border-color font-mono text-sm" />
<div class="flex items-center gap-3 h-10 px-1">
<input id="is_ssd" type="checkbox" bind:checked={newMedia.is_ssd} class="w-4 h-4 rounded border-border-color bg-bg-primary text-blue-600 focus:ring-blue-500/20" />
<label class="text-xs font-medium text-text-secondary cursor-pointer" for="is_ssd">SSD (Solid State Drive)</label>
</div>
<div class="space-y-2">
<label class="text-xs font-medium text-text-secondary ml-1" for="hdd-encryption_secret_name">Encryption Secret</label>
@@ -1435,23 +1426,9 @@
{#if editingMedia.media_type === 'lto_tape'}
<div class="space-y-4">
<h3 class="text-xs font-semibold text-text-secondary uppercase tracking-wider">LTO Configuration</h3>
<div class="grid grid-cols-2 gap-4">
<div class="flex items-center gap-3 h-10 px-1">
<input id="edit-compression" type="checkbox" bind:checked={editingMedia.compression} class="w-4 h-4 rounded border-border-color bg-bg-primary text-blue-600 focus:ring-blue-500/20" />
<label class="text-xs font-medium text-text-secondary cursor-pointer" for="edit-compression">Hardware Compression</label>
</div>
<div class="flex items-center gap-3 h-10 px-1">
<input id="edit-worm" type="checkbox" bind:checked={editingMedia.worm} class="w-4 h-4 rounded border-border-color bg-bg-primary text-blue-600 focus:ring-blue-500/20" />
<label class="text-xs font-medium text-text-secondary cursor-pointer" for="edit-worm">WORM</label>
</div>
<div class="flex items-center gap-3 h-10 px-1">
<input id="edit-write_protected" type="checkbox" bind:checked={editingMedia.write_protected} class="w-4 h-4 rounded border-border-color bg-bg-primary text-blue-600 focus:ring-blue-500/20" />
<label class="text-xs font-medium text-text-secondary cursor-pointer" for="edit-write_protected">Write Protected</label>
</div>
<div class="flex items-center gap-3 h-10 px-1">
<input id="edit-cleaning_cartridge" type="checkbox" bind:checked={editingMedia.cleaning_cartridge} class="w-4 h-4 rounded border-border-color bg-bg-primary text-blue-600 focus:ring-blue-500/20" />
<label class="text-xs font-medium text-text-secondary cursor-pointer" for="edit-cleaning_cartridge">Cleaning Cartridge</label>
</div>
<div class="flex items-center gap-3 h-10 px-1">
<input id="edit-compression" type="checkbox" bind:checked={editingMedia.compression} class="w-4 h-4 rounded border-border-color bg-bg-primary text-blue-600 focus:ring-blue-500/20" />
<label class="text-xs font-medium text-text-secondary cursor-pointer" for="edit-compression">Hardware Compression</label>
</div>
<div class="space-y-2">
<label class="text-xs font-medium text-text-secondary ml-1" for="edit-lto-encryption_secret_name">Encryption Secret</label>
@@ -1482,10 +1459,6 @@
<input id="edit-is_ssd" type="checkbox" bind:checked={editingMedia.is_ssd} class="w-4 h-4 rounded border-border-color bg-bg-primary text-blue-600 focus:ring-blue-500/20" />
<label class="text-xs font-medium text-text-secondary cursor-pointer" for="edit-is_ssd">SSD</label>
</div>
<div class="flex items-center gap-3 h-10 px-1">
<input id="edit-encrypted" type="checkbox" bind:checked={editingMedia.encrypted} class="w-4 h-4 rounded border-border-color bg-bg-primary text-blue-600 focus:ring-blue-500/20" />
<label class="text-xs font-medium text-text-secondary cursor-pointer" for="edit-encrypted">Encrypted</label>
</div>
</div>
<div class="space-y-2">
<label class="text-xs font-medium text-text-secondary ml-1" for="edit-hdd-encryption_secret_name">Encryption Secret</label>

@@ -20,7 +20,8 @@
Upload,
Terminal,
Globe,
Key
Key,
ChevronDown
} from "lucide-svelte";
import { Button } from "$lib/components/ui/button";
import PageHeader from "$lib/components/ui/PageHeader.svelte";
@@ -51,6 +52,7 @@
let scanSchedule = $state("");
let archivalSchedule = $state("");
let notificationUrls = $state<string[]>([]);
let ioniceLevel = $state("idle");

// Secrets keystore
let secretsList = $state<string[]>([]);
@@ -66,7 +68,8 @@
globalExclusions,
scanSchedule,
archivalSchedule,
notificationUrls
notificationUrls,
ioniceLevel
}));

beforeNavigate((navigation: any) => {
@@ -155,6 +158,7 @@
if (data.schedule_scan) scanSchedule = data.schedule_scan;
if (data.schedule_archival) archivalSchedule = data.schedule_archival;
if (data.notification_urls) notificationUrls = JSON.parse(data.notification_urls);
if (data.ionice_level) ioniceLevel = data.ionice_level;
}

// Load secrets
@@ -169,7 +173,8 @@
globalExclusions,
scanSchedule,
archivalSchedule,
notificationUrls
notificationUrls,
ioniceLevel
});
} catch (error) {
toast.error("Failed to load system configuration");
@@ -188,7 +193,8 @@
updateSettings({ body: { key: "global_exclusions", value: globalExclusions } }),
updateSettings({ body: { key: "schedule_scan", value: scanSchedule } }),
updateSettings({ body: { key: "schedule_archival", value: archivalSchedule } }),
updateSettings({ body: { key: "notification_urls", value: JSON.stringify(notificationUrls) } })
updateSettings({ body: { key: "notification_urls", value: JSON.stringify(notificationUrls) } }),
updateSettings({ body: { key: "ionice_level", value: ioniceLevel } })
]);

// Snapshot saved state
@@ -199,7 +205,8 @@
globalExclusions,
scanSchedule,
archivalSchedule,
notificationUrls
notificationUrls,
ioniceLevel
});

toast.success("System configuration committed");
@@ -648,6 +655,24 @@

{:else if activeTab === 'system'}
<div class="animate-in slide-in-from-bottom-4 duration-500 space-y-6">
<Card class="p-5 shadow-xl">
<SectionHeader title="I/O scheduling" icon={Cpu} class="mb-6 px-0" />
<div class="space-y-4">
<div class="space-y-2">
<label class="text-xs font-medium text-text-secondary ml-1" for="ionice-level">Background job I/O priority</label>
<div class="relative">
<select id="ionice-level" bind:value={ioniceLevel} class="w-full h-10 bg-bg-primary border border-border-color rounded-xl px-4 pr-10 text-sm font-medium text-text-primary outline-none focus:ring-2 focus:ring-blue-500/20 transition-all appearance-none cursor-pointer">
<option value="idle">Idle (only use I/O when system is free)</option>
<option value="best-effort">Best-effort (normal scheduling)</option>
<option value="realtime">Real-time (highest priority, requires root)</option>
</select>
<ChevronDown size={16} class="absolute right-3 top-1/2 -translate-y-1/2 text-text-secondary pointer-events-none" />
</div>
<p class="text-[10px] text-text-secondary leading-tight opacity-60">Applies to scan and backup jobs. Idle is recommended for production systems.</p>
</div>
</div>
</Card>

<Card class="p-5 shadow-xl">
<SectionHeader title="Index management" icon={Database} class="mb-6 px-0" />
<div class="grid grid-cols-2 gap-4">