HA Sync — Execution Plan
Problem Statement
Two servers (Dell OptiPlex 7070 at 192.168.2.100 and HP ProLiant at 192.168.2.193) each export the same folder set over NFS. A Kubernetes-native tool must keep each folder pair in bidirectional sync: newest file wins, mtime is preserved on copy, delete propagation is strict (one-way per CronJob), and every operation is logged in the MySQL instance in the infrastructure namespace.
Architecture Decisions (Agreed)
| Decision | Choice | Rationale |
| --- | --- | --- |
| Language | Go | Single static binary, excellent async I/O, no runtime overhead |
| Sync direction | Bidirectional via two one-way CronJobs | Each folder pair gets a→b and b→a jobs; newest-mtime wins |
| Loop prevention | Preserve mtime on copy + --delete-missing flag | Mtime equality → skip; no extra DB state needed |
| Lock | Kubernetes Lease object (coordination.k8s.io/v1) | Native K8s TTL; survives MySQL outage; sync blocked only if K8s API is down (already required for CronJob) |
| Change detection | mtime + size first; MD5 only on mtime/size mismatch | Efficient for large datasets |
| Delete propagation | Strict mirror — configurable per job via --delete-missing | See ⚠️ note below |
| Volume access | NFS mounts (both servers already export NFS) | No HostPath or node-affinity needed |
| Audit logging | Write to opslog file during run; flush to MySQL on completion | MySQL outage does not block sync; unprocessed opslogs are retried on next run |
| Opslog storage | Persistent NFS-backed PVC at /var/log/ha-sync/ | /tmp is ephemeral (lost on pod exit); NFS PVC persists across CronJob runs for 10-day retention |
Locking: Kubernetes Lease
Each sync pair uses a coordination.k8s.io/v1 Lease object named ha-sync-<pair> in the infrastructure namespace.
- spec.holderIdentity = <pod-name>/<iteration-id>
- spec.leaseDurationSeconds = --lock-ttl (default 3600)
- A background goroutine renews (spec.renewTime) every leaseDurationSeconds / 3 seconds
- On normal exit or SIGTERM: the Lease is deleted (released)
- Stale leases (holder crashed without release) expire automatically after leaseDurationSeconds
- Requires RBAC: a ServiceAccount with create/get/update/delete on leases in infrastructure
Audit Logging: Opslog + MySQL Flush
- On sync start: open /var/log/ha-sync/opslog-<pair>-<direction>-<RFC3339>.jsonl
- Each file operation: append one JSON line (all sync_operations fields)
- On sync end: attempt flush to MySQL (sync_iterations + sync_operations batch INSERT)
- On successful flush: delete the opslog file
- On MySQL failure: leave the opslog; on the next run, scan /var/log/ha-sync/ for unprocessed opslogs and retry the flush before starting a new sync
- Cleanup: after each run, delete opslogs older than 10 days (os.Stat mtime check)
⚠️ Delete Propagation Warning
With two one-way jobs per pair, ordering matters for deletes. If dell→hp runs before hp→dell and --delete-missing is ON for both, files that only exist on HP will be deleted before they're copied to Dell.
Safe default: --delete-missing=false for all jobs. Enable --delete-missing=true only on the primary direction (e.g., dell→hp for each pair) once the initial full sync has completed and both sides are known-equal.
NFS Sync Pairs
| Pair name |
Dell NFS (192.168.2.100) |
HP NFS (192.168.2.193) |
media |
/data/media |
/data/media |
photos |
/data/photos |
/data/photos |
owncloud |
/data/owncloud |
/data/owncloud |
games |
/data/games |
/data/games |
infra |
/data/infra |
/data/infra |
ai |
/data/ai |
/data/ai |
Each pair produces two CronJobs in the infrastructure namespace.
CLI Interface (ha-sync)
ha-sync [flags]
Required:
--src <path> Source directory (absolute path inside pod)
--dest <path> Destination directory (absolute path inside pod)
--pair <name> Logical pair name (e.g. "media"); used as Lease name ha-sync-<pair>
Optional:
--direction <str> Label for logging, e.g. "dell-to-hp" (default: "fwd")
--db-dsn <dsn> MySQL DSN (default: from env HA_SYNC_DB_DSN)
--lock-ttl <seconds> Lease TTL before considered stale (default: 3600)
--log-dir <path> Directory for opslog files (default: /var/log/ha-sync)
--log-retain-days <n> Delete opslogs older than N days (default: 10)
--mtime-threshold <s> Seconds of tolerance for mtime equality (default: 2)
--delete-missing Delete dest files not present in src (default: false)
--workers <n> Concurrent file workers (default: 4)
--dry-run Compute what would sync, save to DB as dry_run rows, print plan; do not copy/delete (default: false)
--verbose Verbose output
--help                 Print usage and exit
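The flag surface above maps directly onto the stdlib flag package. A sketch of the parser (Config field names are illustrative; defaults are taken from the list above):

```go
package main

import (
	"flag"
	"fmt"
	"os"
)

// Config mirrors the CLI surface; names and defaults follow the flag list.
type Config struct {
	Src, Dest, Pair, Direction string
	DBDSN                      string
	LockTTL                    int
	LogDir                     string
	LogRetainDays              int
	MtimeThreshold             int
	DeleteMissing              bool
	Workers                    int
	DryRun, Verbose            bool
}

func parseFlags(args []string) (*Config, error) {
	c := &Config{}
	fs := flag.NewFlagSet("ha-sync", flag.ContinueOnError)
	fs.StringVar(&c.Src, "src", "", "source directory (required)")
	fs.StringVar(&c.Dest, "dest", "", "destination directory (required)")
	fs.StringVar(&c.Pair, "pair", "", "logical pair name (required)")
	fs.StringVar(&c.Direction, "direction", "fwd", "label for logging")
	fs.StringVar(&c.DBDSN, "db-dsn", os.Getenv("HA_SYNC_DB_DSN"), "MySQL DSN")
	fs.IntVar(&c.LockTTL, "lock-ttl", 3600, "lease TTL seconds")
	fs.StringVar(&c.LogDir, "log-dir", "/var/log/ha-sync", "opslog directory")
	fs.IntVar(&c.LogRetainDays, "log-retain-days", 10, "opslog retention days")
	fs.IntVar(&c.MtimeThreshold, "mtime-threshold", 2, "mtime tolerance seconds")
	fs.BoolVar(&c.DeleteMissing, "delete-missing", false, "delete dest files missing in src")
	fs.IntVar(&c.Workers, "workers", 4, "concurrent file workers")
	fs.BoolVar(&c.DryRun, "dry-run", false, "plan only; no copy/delete")
	fs.BoolVar(&c.Verbose, "verbose", false, "verbose output")
	if err := fs.Parse(args); err != nil {
		return nil, err
	}
	if c.Src == "" || c.Dest == "" || c.Pair == "" {
		return nil, fmt.Errorf("--src, --dest and --pair are required")
	}
	return c, nil
}

func main() {
	c, err := parseFlags([]string{"--src=/mnt/dell/media", "--dest=/mnt/hp/media", "--pair=media"})
	if err != nil {
		panic(err)
	}
	fmt.Println(c.Pair, c.Workers, c.DryRun)
}
```

Note the --db-dsn default falls back to HA_SYNC_DB_DSN at parse time, matching the flag description.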
MySQL Schema (database: general_db)
-- One row per CronJob execution
CREATE TABLE IF NOT EXISTS sync_iterations (
id BIGINT AUTO_INCREMENT PRIMARY KEY,
sync_pair VARCHAR(255) NOT NULL,
direction VARCHAR(64) NOT NULL,
src VARCHAR(512) NOT NULL,
dest VARCHAR(512) NOT NULL,
started_at DATETIME(3) NOT NULL,
ended_at DATETIME(3),
status ENUM('running','success','partial_failure','failed') NOT NULL DEFAULT 'running',
dry_run TINYINT(1) NOT NULL DEFAULT 0,
files_created INT DEFAULT 0,
files_updated INT DEFAULT 0,
files_deleted INT DEFAULT 0,
files_skipped INT DEFAULT 0,
files_failed INT DEFAULT 0,
total_bytes_transferred BIGINT DEFAULT 0,
error_message TEXT,
INDEX idx_pair (sync_pair),
INDEX idx_started (started_at),
INDEX idx_dry_run (dry_run)
);
-- One row per individual file operation (flushed from opslog on sync completion)
CREATE TABLE IF NOT EXISTS sync_operations (
id BIGINT AUTO_INCREMENT PRIMARY KEY,
iteration_id BIGINT NOT NULL,
dry_run TINYINT(1) NOT NULL DEFAULT 0,
operation ENUM('create','update','delete') NOT NULL,
filepath VARCHAR(4096) NOT NULL,
size_before BIGINT,
size_after BIGINT,
md5_before VARCHAR(32),
md5_after VARCHAR(32),
started_at DATETIME(3) NOT NULL,
ended_at DATETIME(3),
status ENUM('success','fail') NOT NULL,
error_message VARCHAR(4096),
INDEX idx_iteration (iteration_id),
CONSTRAINT fk_iteration FOREIGN KEY (iteration_id) REFERENCES sync_iterations(id)
);
No sync_locks table — locking is handled by Kubernetes Lease objects.
Dry-run Idempotency Rules
--dry-run mode: walk source and dest, compute the full set of would-be operations (create/update/delete), save to DB with dry_run = 1, print the plan. No files are copied or deleted.
- Idempotency check: before running a dry-run, query for the last successful dry-run iteration for (pair, direction):
SELECT id, started_at FROM sync_iterations
WHERE sync_pair = ? AND direction = ? AND dry_run = 1 AND status = 'success'
ORDER BY started_at DESC LIMIT 1;
Then re-walk the source and dest and compute the would-be operation set. Compare it against the sync_operations rows from that previous dry-run iteration (same set of filepath + operation + size_before). If identical → print "Dry-run already current as of <started_at>. Nothing has changed." and exit without writing new rows.
- Production run (--dry-run not set): all queries for previous iterations use WHERE dry_run = 0. Dry-run rows are never considered for skip logic, idempotency, or status reporting in production runs.
- Lease is still acquired during dry-run (prevents two dry-runs from racing each other).
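The "same set of filepath + operation + size_before" comparison is an order-insensitive set equality. A sketch using a comparable struct as the map key (PlannedOp and samePlan are illustrative names):

```go
package main

import "fmt"

// PlannedOp is the identity used for the idempotency comparison:
// filepath + operation + size_before, per the rule above.
type PlannedOp struct {
	Filepath   string
	Operation  string
	SizeBefore int64
}

// samePlan reports whether two would-be operation sets are identical,
// ignoring order; counting handles duplicate entries correctly.
func samePlan(a, b []PlannedOp) bool {
	if len(a) != len(b) {
		return false
	}
	seen := make(map[PlannedOp]int, len(a))
	for _, op := range a {
		seen[op]++
	}
	for _, op := range b {
		seen[op]--
		if seen[op] < 0 {
			return false
		}
	}
	return true
}

func main() {
	prev := []PlannedOp{{"/a", "create", 0}, {"/b", "update", 9}}
	next := []PlannedOp{{"/b", "update", 9}, {"/a", "create", 0}}
	fmt.Println(samePlan(prev, next)) // order-insensitive: true
}
```

If the plans match, the run prints the "already current" message and exits without new rows; any mismatch (even a single size_before change) produces a fresh dry-run iteration.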
Project Structure
services/ha-sync/
cmd/ha-sync/
main.go # Sync CLI entry point
cmd/ha-sync-ui/
main.go # Dashboard HTTP server entry point (serves ha-sync.vandachevici.ro)
internal/
config/
config.go # Config struct, defaults, validation (shared by both binaries)
db/
db.go # MySQL connect, auto-migrate schema
logging.go # StartIteration, FinishIteration, BulkInsertOperations, LastDryRunOps
lease/
lease.go # Acquire/release/heartbeat Kubernetes Lease object
opslog/
writer.go # Append JSON lines to /var/log/ha-sync/opslog-<pair>-<direction>-<RFC3339>.jsonl
flusher.go # Scan for unprocessed opslogs, batch INSERT; cleanup logs >10 days
sync/
engine.go # Main sync loop: walk, compare, dispatch; dryRun flag skips writes
walker.go # Recursive directory walk
compare.go # mtime+size comparison; conditional MD5
copy.go # File copy with os.Chtimes() mtime preservation
delete.go # Safe delete with pre-check
ui/
handler.go # HTTP handlers: index, /api/iterations, /api/operations, /api/pairs
templates/
index.html # Dashboard HTML; auto-refreshes every 10s via fetch(); vanilla JS only
go.mod
go.sum
Dockerfile # Multi-stage: golang:1.22-alpine builder (builds ha-sync + ha-sync-ui) → alpine:3.20
Makefile # build, docker-build IMAGE=<registry>/ha-sync:latest, docker-push targets
deployment/ha-sync/
serviceaccount.yaml # ServiceAccount: ha-sync, namespace: infrastructure
rbac.yaml # Role + RoleBinding: leases (coordination.k8s.io) create/get/update/delete
secret.yaml # NOTE: create manually — see Phase 3C instructions
pv-logs.yaml # PersistentVolume: NFS 192.168.2.193:/data/infra/ha-sync-logs, 10Gi, RWX
pvc-logs.yaml # PVC bound to pv-logs; all CronJobs mount at /var/log/ha-sync
pv-dell-<pair>.yaml # PersistentVolume: NFS 192.168.2.100:/data/<pair> (one per pair × 6)
pv-hp-<pair>.yaml # PersistentVolume: NFS 192.168.2.193:/data/<pair> (one per pair × 6)
pvc-dell-<pair>.yaml # PVC → pv-dell-<pair> (one per pair × 6)
pvc-hp-<pair>.yaml # PVC → pv-hp-<pair> (one per pair × 6)
cron-<pair>-dell-to-hp.yaml # --dry-run is DEFAULT; remove flag to enable production sync
cron-<pair>-hp-to-dell.yaml # same
ui-deployment.yaml # Deployment: ha-sync-ui, 1 replica, image: <registry>/ha-sync:latest, cmd: ha-sync-ui
ui-service.yaml # ClusterIP Service: port 8080 → ha-sync-ui pod
ui-ingress.yaml # Ingress: ha-sync.vandachevici.ro → ui-service:8080; cert-manager TLS
kustomization.yaml # Kustomize root listing all resources
scripts/cli/
ha-sync.md # CLI reference doc
UI Dashboard (ha-sync.vandachevici.ro)
- Binary: ha-sync-ui — Go HTTP server, port 8080
- Routes:
  - GET / — HTML dashboard; auto-refreshes via setInterval + fetch
  - GET /api/pairs — JSON: per-pair last iteration summary (dry_run=0 and dry_run=1 separately)
  - GET /api/iterations?pair=&limit=20 — JSON: recent iterations
  - GET /api/operations?iteration_id= — JSON: operations for one iteration
- Dashboard shows: per-pair status cards (last real sync, last dry-run, files created/updated/deleted/failed), recent activity table, errors highlighted in red
- Env vars: HA_SYNC_DB_DSN (same secret as CronJobs)
- K8s: Deployment in infrastructure namespace, 1 replica, same ServiceAccount as CronJobs (read-only DB access only)
Tasks
Parallelism key: Tasks marked [P] can be executed in parallel by separate agents. Tasks marked [SEQ] must follow the listed dependency chain.
Phase 0 — Scaffolding [SEQ]
Must complete before any code is written; all subsequent tasks depend on this.
| # | Task | Command / Notes |
| --- | --- | --- |
| 0.1 | Create Go module | cd services/ha-sync && go mod init github.com/vandachevici/homelab/ha-sync |
| 0.2 | Create directory tree | mkdir -p cmd/ha-sync cmd/ha-sync-ui internal/{config,db,lease,opslog,sync,ui} |
| 0.3 | Create Dockerfile | Multi-stage: FROM golang:1.22-alpine AS build → FROM alpine:3.20; copy binary; ENTRYPOINT ["/ha-sync"] |
| 0.4 | Create Makefile | Targets: build, docker-build IMAGE=<registry>/ha-sync:latest, docker-push IMAGE=... |
Phase 1 — Core Go packages [P after Phase 0]
Sub-tasks 1A, 1B, 1C, 1D, and 1E are fully independent — assign them to separate agents simultaneously. 1F depends on all of them.
1A — internal/config [P]
| # | Task | Notes |
| --- | --- | --- |
| 1A.1 | Write config.go | Define Config struct with all CLI flags; use flag stdlib or cobra; set defaults from CLI Interface section above |
1B — internal/db [P]
| # | Task | Notes |
| --- | --- | --- |
| 1B.1 | Write db.go | Connect(dsn string) (*sql.DB, error); run CREATE TABLE IF NOT EXISTS for both tables (include dry_run TINYINT(1) NOT NULL DEFAULT 0 column in both) on startup |
| 1B.2 | Write logging.go | StartIteration(dryRun bool, ...) (id int64) → INSERT with dry_run set; FinishIteration(id, status, counts) → UPDATE; BulkInsertOperations(iterID int64, dryRun bool, []OpRecord) → batch INSERT; LastDryRunOps(db, pair, direction string) ([]OpRecord, error) → fetch ops for last successful dry_run=1 iteration for idempotency check |
1C — internal/lease [P]
| # | Task | Notes |
| --- | --- | --- |
| 1C.1 | Write lease.go | Use k8s.io/client-go in-cluster config; Acquire(ctx, client, namespace, leaseName, holderID, ttlSec) — create or update Lease if expired; Release(ctx, client, namespace, leaseName, holderID) — delete Lease; Heartbeat(ctx, ...) — goroutine that calls Update on spec.renewTime every ttlSec/3 seconds |
1D — internal/opslog [P]
| # | Task | Notes |
| --- | --- | --- |
| 1D.1 | Write writer.go | Open(logDir, pair, direction string) (*Writer, error) — creates /var/log/ha-sync/opslog-<pair>-<direction>-<RFC3339>.jsonl; Append(op OpRecord) error — JSON-encode one line |
| 1D.2 | Write flusher.go | FlushAll(logDir string, db *sql.DB) error — scan dir for *.jsonl, for each: decode lines → call BulkInsertOperations, delete file on success; CleanOld(logDir string, retainDays int) — delete files with mtime older than N days |
1E — internal/sync [P]
| # | Task | Notes |
| --- | --- | --- |
| 1E.1 | Write walker.go | Walk(root string) ([]FileInfo, error) — returns slice of {RelPath, AbsPath, Size, ModTime, IsDir}; use filepath.WalkDir |
| 1E.2 | Write compare.go | NeedsSync(src, dest FileInfo, threshold time.Duration) bool — mtime+size check; MD5File(path string) (string, error) — streaming MD5; MD5Changed(srcPath, destPath string) bool |
| 1E.3 | Write copy.go | CopyFile(src, dest string, srcModTime time.Time) error — copy bytes, then os.Chtimes(dest, srcModTime, srcModTime) to preserve mtime |
| 1E.4 | Write delete.go | DeleteFile(path string) error — os.Remove; DeleteDir(path string) error — os.RemoveAll only if dir is empty after child removal |
| 1E.5 | Write engine.go | Walk src+dest, compare, dispatch create/update/delete via worker pool (sync.WaitGroup + buffered channel of --workers size); if dryRun=true, build op list but do not call copy/delete — return ops for caller to log; write each op to opslog.Writer (tagged with dry_run flag); return summary counts |
1F — cmd/ha-sync/main.go [SEQ, depends on 1A+1B+1C+1D+1E]
| # | Task | Notes |
| --- | --- | --- |
| 1F.1 | Write main.go | Parse flags → build config → connect DB → flush old opslogs → acquire Lease → if --dry-run: call LastDryRunOps, walk src+dest, compute would-be ops, compare; if identical → print "already current" + exit; else run engine(dryRun=true) → open opslog writer (tagged dry_run) → start iteration row (dry_run = true/false) → run engine → finish iteration → flush opslog to DB → release Lease; trap SIGTERM to release Lease before exit; production queries always filter dry_run = 0 |
Phase 2 — Build & Docker Image [SEQ after Phase 1]
| # | Task | Command |
| --- | --- | --- |
| 2.1 | Fetch Go deps | cd services/ha-sync && go mod tidy |
| 2.2 | Build binary | cd services/ha-sync && make build |
| 2.3 | Build Docker image | make docker-build IMAGE=192.168.2.100:5000/ha-sync:latest (replace registry if different) |
| 2.4 | Push Docker image | make docker-push IMAGE=192.168.2.100:5000/ha-sync:latest |
Phase 3 — Kubernetes Manifests [P, can start during Phase 1]
All manifest sub-tasks are independent and can be parallelized.
3A — RBAC + Shared Resources [P]
| # | Task | Notes |
| --- | --- | --- |
| 3A.1 | Create serviceaccount.yaml | name: ha-sync, namespace: infrastructure |
| 3A.2 | Create rbac.yaml | Role with rules: apiGroups: [coordination.k8s.io], resources: [leases], verbs: [create, get, update, delete]; RoleBinding binding ha-sync SA to the Role |
| 3A.3 | Create pv-logs.yaml + pvc-logs.yaml | PV: nfs.server: 192.168.2.193, nfs.path: /data/infra/ha-sync-logs, capacity 10Gi, accessModes: [ReadWriteMany]; PVC: storageClassName: "", volumeName: pv-ha-sync-logs, namespace infrastructure |
3B — PVs and PVCs per pair [P]
| # | Task | Notes |
| --- | --- | --- |
| 3B.1 | Create pv-dell-<pair>.yaml for each of 6 pairs | spec.nfs.server: 192.168.2.100, spec.nfs.path: /data/<pair>; capacity per pair: media: 2Ti, photos: 500Gi, games: 500Gi, owncloud: 500Gi, infra: 100Gi, ai: 500Gi; accessModes: [ReadWriteMany] |
| 3B.2 | Create pv-hp-<pair>.yaml for each of 6 pairs | Same structure; spec.nfs.server: 192.168.2.193 |
| 3B.3 | Create pvc-dell-<pair>.yaml + pvc-hp-<pair>.yaml | namespace: infrastructure; accessModes: [ReadWriteMany]; storageClassName: "" (manual bind); volumeName: pv-dell-<pair> / pv-hp-<pair> |
3C — CronJobs [P, depends on 3A+3B for volume/SA names]
| # | Task | Notes |
| --- | --- | --- |
| 3C.1 | Create cron-<pair>-dell-to-hp.yaml for each pair | namespace: infrastructure; serviceAccountName: ha-sync; schedule: "*/15 * * * *"; image: <registry>/ha-sync:latest; args: ["--src=/mnt/dell/<pair>","--dest=/mnt/hp/<pair>","--pair=<pair>","--direction=dell-to-hp","--db-dsn=$(HA_SYNC_DB_DSN)","--log-dir=/var/log/ha-sync"]; volumeMounts: pvc-dell-<pair> → /mnt/dell/<pair>, pvc-hp-<pair> → /mnt/hp/<pair>, pvc-ha-sync-logs → /var/log/ha-sync; envFrom: ha-sync-db-secret |
| 3C.2 | Create cron-<pair>-hp-to-dell.yaml for each pair | Same but src/dest swapped, direction=hp-to-dell; offset schedule by 7 min: "7,22,37,52 * * * *" |
| 3C.3 | Create secret.yaml | Comment-only file; actual secret created manually: kubectl create secret generic ha-sync-db-secret --from-literal=HA_SYNC_DB_DSN='<user>:<pass>@tcp(general-purpose-db.infrastructure.svc.cluster.local:3306)/general_db' -n infrastructure |
| 3C.4 | Create kustomization.yaml | Resources in order: serviceaccount.yaml, rbac.yaml, pv-logs.yaml, pvc-logs.yaml, all pv-*.yaml, all pvc-*.yaml, all cron-*.yaml |
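For reference, 3C.1 could render to a manifest roughly like the following sketch for the media pair. Names, mount paths, and the registry are taken from this plan; concurrencyPolicy: Forbid is an extra suggestion (belt-and-braces alongside the Lease), not something the plan mandates:

```yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: ha-sync-media-dell-to-hp
  namespace: infrastructure
spec:
  schedule: "*/15 * * * *"
  concurrencyPolicy: Forbid     # suggested: skip a run if the previous one is still going
  jobTemplate:
    spec:
      template:
        spec:
          serviceAccountName: ha-sync
          restartPolicy: Never
          containers:
            - name: ha-sync
              image: 192.168.2.100:5000/ha-sync:latest
              args:
                - --src=/mnt/dell/media
                - --dest=/mnt/hp/media
                - --pair=media
                - --direction=dell-to-hp
                - --log-dir=/var/log/ha-sync
                - --dry-run             # default-on per this plan; remove to go live
              envFrom:
                - secretRef:
                    name: ha-sync-db-secret
              volumeMounts:
                - { name: dell-media, mountPath: /mnt/dell/media }
                - { name: hp-media, mountPath: /mnt/hp/media }
                - { name: logs, mountPath: /var/log/ha-sync }
          volumes:
            - name: dell-media
              persistentVolumeClaim: { claimName: pvc-dell-media }
            - name: hp-media
              persistentVolumeClaim: { claimName: pvc-hp-media }
            - name: logs
              persistentVolumeClaim: { claimName: pvc-ha-sync-logs }
```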
Phase 4 — CLI Documentation [P, independent]
| # | Task | Notes |
| --- | --- | --- |
| 4.1 | Create scripts/cli/ha-sync.md | Document all flags, defaults, example invocations, env vars (HA_SYNC_DB_DSN); note --dry-run for safe first-run; note --delete-missing rollout guidance |
Phase 5 — Deploy & Verify [SEQ after Phase 2+3]
| # | Task | Command |
| --- | --- | --- |
| 5.1 | Create DB secret | kubectl create secret generic ha-sync-db-secret --from-literal=HA_SYNC_DB_DSN='<user>:<pass>@tcp(general-purpose-db.infrastructure.svc.cluster.local:3306)/general_db' -n infrastructure |
| 5.2 | Apply manifests | kubectl apply -k deployment/ha-sync/ |
| 5.3 | Dry-run smoke test | kubectl create job ha-sync-test --from=cronjob/ha-sync-media-dell-to-hp -n infrastructure then: kubectl logs -l job-name=ha-sync-test -n infrastructure -f |
| 5.4 | Verify Lease is created | kubectl get lease ha-sync-media -n infrastructure -o yaml |
| 5.5 | Verify DB rows | kubectl exec -it <general-purpose-db-pod> -n infrastructure -- mysql -u<user> -p general_db -e "SELECT * FROM sync_iterations ORDER BY id DESC LIMIT 5;" |
| 5.6 | Verify opslog flush | Check /var/log/ha-sync/ on the logs PVC — no .jsonl files should remain after a successful run |
| 5.7 | Trigger real first run | Delete the test job; let CronJob run on schedule; observe sync_operations table |
Open Questions / Future Work
- MySQL HA: general-purpose-db is a single-replica StatefulSet — no HA. Since locking is handled by a K8s Lease and MySQL is only used for audit logging (with opslog fallback), a MySQL outage won't block sync. If full MySQL HA is later desired, MariaDB Galera Cluster (3 replicas) is the recommended path for this homelab.
- Conflict resolution: currently "newest mtime wins". If clocks drift between nodes, a file could ping-pong. Consider NTP enforcement across all nodes or set --mtime-threshold >= the observed clock skew.
- Delete safety: --delete-missing defaults to false. Staged rollout: run one full cycle disabled first → confirm parity → enable on the primary direction only.
- Alerting: add a Prometheus/Grafana alert on sync_iterations.status = 'failed' (query general_db directly or expose a future /metrics endpoint).
- DB retention: sync_operations will grow large. Add a cleanup step: DELETE FROM sync_operations WHERE started_at < NOW() - INTERVAL 30 DAY as a weekly CronJob.
- Registry: the Dockerfile assumes a local registry at 192.168.2.100:5000. Confirm the registry address before Phase 2.