# Homelab Kubernetes Deployment Manifests
Reconstructed 2026-03-20 from live cluster state using `kubectl.kubernetes.io/last-applied-configuration` annotations.
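The reconstruction step can be reproduced per object. A sketch of the assumed workflow (the README only states that the manifests came from `last-applied-configuration` annotations); the function prints the `kubectl` command rather than running it, since it needs a live cluster:

```shell
# Print the kubectl command that dumps the original manifest of one object.
# Note the escaped dots: jsonpath requires "\." inside annotation key names.
last_applied_cmd() {
  ns="$1"; kind="$2"; name="$3"
  jsonpath='{.metadata.annotations.kubectl\.kubernetes\.io/last-applied-configuration}'
  printf "kubectl -n %s get %s %s -o jsonpath='%s'\n" "$ns" "$kind" "$name" "$jsonpath"
}

last_applied_cmd media deployment jellyfin
```

Running the printed command against the cluster emits the JSON manifest exactly as it was last applied, which can then be cleaned up into the YAML files in this directory.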
## Directory Structure
```
deployment/
├── 00-namespaces.yaml            # All namespace definitions — apply first
├── games/
│   ├── factorio.yaml             # Factorio server (hostPort 34197)
│   ├── minecraft-cheats.yaml     # Minecraft cheats (hostPort 25111)
│   ├── minecraft-creative.yaml   # Minecraft creative (hostPort 25559)
│   ├── minecraft-home.yaml       # Minecraft home (hostPort 25112)
│   ├── minecraft-jaron.yaml      # Minecraft jaron (hostPort 25564)
│   ├── minecraft-johannes.yaml   # Minecraft johannes (hostPort 25563)
│   ├── minecraft-noah.yaml       # Minecraft noah (hostPort 25560)
│   └── openttd.yaml              # OpenTTD (NodePort 30979/30978)
├── monitoring/
│   └── prometheus-pv.yaml        # Manual local-storage PV for Prometheus
├── infrastructure/
│   ├── cert-issuers.yaml         # ClusterIssuers: letsencrypt-prod + staging
│   ├── dns-updater.yaml          # DaemonSet + ConfigMap (DigitalOcean DynDNS)
│   ├── general-db.yaml           # MySQL 9 StatefulSet (shared DB for speedtest etc.)
│   ├── paperclip.yaml            # Paperclip AI — PV + Deployment + Service + Ingress
│   └── speedtest-tracker.yaml    # Speedtest Tracker + ConfigMap + Ingress
├── storage/
│   ├── owncloud.yaml             # OwnCloud server + ConfigMap + Ingress
│   ├── owncloud-mariadb.yaml     # MariaDB 10.6 StatefulSet
│   └── owncloud-redis.yaml       # Redis 6 Deployment
├── media/
│   ├── jellyfin.yaml             # Jellyfin + ConfigMap + Ingress
│   └── immich.yaml               # Immich full stack (server, ml, db, valkey) + Ingress
├── iot/
│   ├── iot-db.yaml               # MySQL 9 StatefulSet for IoT data
│   └── iot-api.yaml              # IoT API (local image, see note below)
├── ai/
│   └── ollama.yaml               # Ollama (currently scaled to 0)
├── default/
│   └── dns-updater-legacy.yaml   # Legacy default-ns resources (hp-fast-pv, old ollama)
└── helm/
    ├── nfs-provisioners/         # Values for all NFS subdir provisioner releases
    │   ├── values-vtrak.yaml       # nfs-vtrak (default StorageClass)
    │   ├── values-general.yaml     # nfs-general (500G quota)
    │   ├── values-general-db.yaml  # nfs-general-db (20G quota)
    │   ├── values-immich.yaml      # nfs-immich (300G quota)
    │   ├── values-jellyfin.yaml    # nfs-jellyfin (700G quota)
    │   ├── values-owncloud.yaml    # nfs-owncloud (200G quota)
    │   ├── values-minecraft.yaml   # nfs-minecraft (50G quota)
    │   ├── values-factorio.yaml    # nfs-factorio (10G quota)
    │   ├── values-openttd.yaml     # nfs-openttd (5G quota)
    │   ├── values-speedtest.yaml   # nfs-speedtest (5G quota)
    │   ├── values-authentik.yaml   # nfs-authentik (20G quota)
    │   └── values-iot.yaml         # nfs-iot (20G quota)
    ├── cert-manager/
    │   └── values.yaml             # cert-manager v1.19.3 (crds.enabled=true)
    ├── ingress-nginx/
    │   └── values.yaml             # ingress-nginx v4.14.3 (DaemonSet, hostPort)
    ├── monitoring/
    │   └── prometheus-values.yaml  # kube-prometheus-stack (Grafana NodePort 31473)
    └── authentik/
        ├── values.yaml             # Authentik SSO v2026.2.1
        └── redis-values.yaml       # Standalone Redis for Authentik
```
## Apply Order
For a fresh cluster, apply in this order:
```bash
BASE=/home/dan/homelab/deployment

# 1. Namespaces
kubectl apply -f $BASE/00-namespaces.yaml

# 2. NFS provisioners (Helm) — run from default namespace
helm install nfs-vtrak nfs-subdir-external-provisioner/nfs-subdir-external-provisioner \
  -f $BASE/helm/nfs-provisioners/values-vtrak.yaml
# ... repeat for each nfs-* values file

# 3. cert-manager
helm install cert-manager cert-manager/cert-manager -n cert-manager --create-namespace \
  -f $BASE/helm/cert-manager/values.yaml

# 4. ingress-nginx
helm install ingress-nginx ingress-nginx/ingress-nginx -n infrastructure \
  -f $BASE/helm/ingress-nginx/values.yaml

# 5. ClusterIssuers (requires cert-manager to be ready)
# Create the digitalocean-dns-token secret first:
#   kubectl create secret generic digitalocean-dns-token \
#     --from-literal=access-token=<TOKEN> -n cert-manager
kubectl apply -f $BASE/infrastructure/cert-issuers.yaml

# 6. Prometheus PV (must exist before helm install)
kubectl apply -f $BASE/monitoring/prometheus-pv.yaml
helm install obs prometheus-community/kube-prometheus-stack -n monitoring \
  -f $BASE/helm/monitoring/prometheus-values.yaml

# 7. Infrastructure workloads (create secrets first — see comments in each file)
kubectl apply -f $BASE/infrastructure/dns-updater.yaml
kubectl apply -f $BASE/infrastructure/general-db.yaml
kubectl apply -f $BASE/infrastructure/speedtest-tracker.yaml
kubectl apply -f $BASE/infrastructure/paperclip.yaml

# 8. Storage
kubectl apply -f $BASE/storage/

# 9. Media
kubectl apply -f $BASE/media/

# 10. Games
kubectl apply -f $BASE/games/

# 11. IoT
kubectl apply -f $BASE/iot/

# 12. AI
kubectl apply -f $BASE/ai/

# 13. Authentik
helm install authentik-redis bitnami/redis -n infrastructure \
  -f $BASE/helm/authentik/redis-values.yaml
helm install authentik authentik/authentik -n infrastructure \
  -f $BASE/helm/authentik/values.yaml
```
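The "repeat for each nfs-* values file" step in part 2 can be scripted. A minimal sketch, assuming the chart repo alias `nfs-subdir-external-provisioner` was already added via `helm repo add`; it prints the commands (dry run) instead of executing them:

```shell
# Dry run: print one "helm install" per nfs-* values file; drop "echo" to apply.
# Release names are derived from the filename: values-vtrak.yaml -> nfs-vtrak.
install_nfs_provisioners() {
  dir="$1"
  for f in "$dir"/values-*.yaml; do
    [ -e "$f" ] || continue   # glob matched nothing
    release="nfs-$(basename "$f" .yaml | sed 's/^values-//')"
    echo helm install "$release" \
      nfs-subdir-external-provisioner/nfs-subdir-external-provisioner \
      -f "$f"
  done
}

install_nfs_provisioners /home/dan/homelab/deployment/helm/nfs-provisioners
```

Reviewing the printed commands before removing the `echo` makes it easy to spot a stray values file that would create an unwanted release.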
## Secrets Required (not stored here)
The following secrets must be created manually before applying the relevant workloads:
| Secret | Namespace | Keys | Used By |
|--------|-----------|------|---------|
| `dns-updater-secret` | infrastructure | `digitalocean-token` | dns-updater DaemonSet |
| `digitalocean-dns-token` | cert-manager | `access-token` | ClusterIssuer (DNS01 solver) |
| `general-db-secret` | infrastructure | `root-password`, `database`, `user`, `password` | general-purpose-db, speedtest-tracker |
| `paperclip-secrets` | infrastructure | `BETTER_AUTH_SECRET` | paperclip |
| `owncloud-db-secret` | storage | `root-password`, `user`, `password`, `database` | owncloud-mariadb, owncloud-server |
| `iot-db-secret` | iot | `root-password`, `database`, `user`, `password` | iot-db, iot-api |
| `immich-secret` | media | `db-username`, `db-password`, `db-name`, `jwt-secret` | immich-server, immich-db |
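One way to create these without leaving credential literals in shell history is to generate the Secret manifest and pipe it to `kubectl apply -f -`. A hedged sketch for `general-db-secret` (all values below are placeholders; only the key names come from the table above):

```shell
# Emit an Opaque Secret manifest with the four keys general-db-secret expects.
# stringData lets us pass plain strings; the API server base64-encodes them.
make_db_secret() {
  ns="$1"; name="$2"; root="$3"; db="$4"; user="$5"; pass="$6"
  cat <<EOF
apiVersion: v1
kind: Secret
metadata:
  name: $name
  namespace: $ns
type: Opaque
stringData:
  root-password: $root
  database: $db
  user: $user
  password: $pass
EOF
}

# Usage with real values:  make_db_secret ... | kubectl apply -f -
make_db_secret infrastructure general-db-secret changeme speedtest speedtest changeme
```

The same helper covers `owncloud-db-secret` and `iot-db-secret`, which use the identical four keys; the other secrets need their own key sets.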
## Key Notes
- **kube-node-1 is cordoned** — no general workloads schedule there. Exceptions: DaemonSets (dns-updater, ingress-nginx, node-exporter, flannel) and workloads with explicit `nodeSelector: kubernetes.io/hostname: kube-node-1` (paperclip).
- **NFS storage** — all app data lives on ZFS datasets on the HP ProLiant (`192.168.2.193:/VTrak-Storage/<app>`). The NFS provisioners in the `default` namespace handle dynamic PV provisioning.
- **Prometheus** — intentionally uses `local-storage` at `/kube-storage-room/prometheus/` on kube-node-1 (USB disk sde). The `prometheus-storage-pv` PV must be manually created.
- **Paperclip** — uses local image `paperclip:latest` with `imagePullPolicy: Never`, pinned to kube-node-1. The image must be built locally on that node.
- **iot-api** — currently broken (`ErrImageNeverPull` on kube-node-3). The `iot-api:latest` local image is not present on the worker nodes. Either add a nodeSelector or push to a registry.
- **Ollama** — the `ai/ollama` and `default/ollama` deployments are both scaled to 0. Active LLM serving happens on the openclaw VM (192.168.2.88) via systemd Ollama service.
- **Authentik** — `helm/authentik/values.yaml` contains credentials in plaintext. Treat this file as sensitive.
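The iot-api note above mentions a nodeSelector as one fix. A sketch that prints the corresponding `kubectl patch` command as a dry run, mirroring how paperclip is pinned to kube-node-1; `kube-node-2` here is a placeholder for whichever node actually has the `iot-api:latest` image:

```shell
# Print (not run) a merge patch pinning a Deployment to one node via
# the kubernetes.io/hostname label. Drop "pin_deployment ... | sh" style
# execution until the printed command has been reviewed.
pin_deployment() {
  ns="$1"; deploy="$2"; node="$3"
  printf "kubectl -n %s patch deployment %s --type merge -p '%s'\n" \
    "$ns" "$deploy" \
    "{\"spec\":{\"template\":{\"spec\":{\"nodeSelector\":{\"kubernetes.io/hostname\":\"$node\"}}}}}"
}

pin_deployment iot iot-api kube-node-2   # kube-node-2 is a placeholder
```

Pushing the image to a registry remains the cleaner long-term fix, since a nodeSelector keeps the single-node coupling that broke iot-api in the first place.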