# Homelab Specs

## Hardware

### Dell OptiPlex 7070

- Role: kube-node-1 (control-plane + worker), bare metal
- IP: 192.168.2.100
- SSH: `dan@192.168.2.100`
- CPU: Intel Core i5-9500, 6c/6t, 3.0 GHz base / 4.4 GHz boost, 9 MB L3, 65W TDP, VT-x
- RAM: 16 GB DDR4 2666 MT/s DIMM
- Storage:
  - nvme0: Samsung PM991 256 GB — 1G EFI, 2G /boot, 235.4G LVM (100G → /)
  - sda: Seagate Expansion 2 TB → /data/photos (ext4)
  - sdb: Seagate Expansion+ 2 TB → /mnt/sdb-ro (ext4, READ-ONLY — never touch)
  - sdc1: Seagate Expansion 1 TB → /data/media (ext4)
  - sdc2: Seagate Expansion 788 GB → /data/games (ext4)
  - sdd: Samsung HD103SI 1 TB → /data/owncloud (ext4)
  - sde: Hitachi HTS545050 500 GB → /data/infra (ext4)
  - sdf: Seagate 1 TB → /data/ai (ext4)
- Total: ~7 TB
- Network: 1 Gbit/s
- NFS server: exports `/data/{games,media,photos,owncloud,infra,ai}` to LAN
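A minimal sketch of the corresponding NFS server config on kube-node-1, assuming a 192.168.2.0/24 LAN and common rw/sync export options (the live `/etc/exports` may differ):

```bash
# Illustrative /etc/exports entries (one per exported /data/* path; options are assumptions)
# /data/games   192.168.2.0/24(rw,sync,no_subtree_check)
# /data/media   192.168.2.0/24(rw,sync,no_subtree_check)
# ...

# Reload exports and confirm what the server currently advertises
sudo exportfs -ra
showmount -e 192.168.2.100
```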
### HP ProLiant DL360 G7

- Role: Proxmox hypervisor (192.168.2.193)
- SSH: `root@192.168.2.193` (local id_rsa)
- Web UI: https://proxmox.vandachevici.ro
- Storage:
  - 2× HPE SAS 900 GB in RAID 1+0 → 900 GB usable (Proxmox OS)
  - 4× HPE SAS 900 GB in RAID 1+0 → 1.8 TB usable (VM disks)
  - Promise VTrak J830s: 2× 16 TB → `media-pool` (ZFS, ~14 TB usable)
- Total: ~18 TB
### Promise VTrak J830s

- Connected to the HP ProLiant via SAS
- 2× 16 TB disks, ZFS pool `media-pool`
- ZFS datasets mounted at `/data/X` on the HP (matching the Dell paths)
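A quick read-only check of the pool and its dataset mountpoints on the HP (standard ZFS CLI; pool name from above):

```bash
# List media-pool datasets with mountpoints and usage
zfs list -r -o name,mountpoint,used,avail media-pool
# Pool health and the backing disks
zpool status media-pool
```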
## Storage Layout

### Dell /data drives (primary/local)

| Mount | Device | Size | Contents |
|---|---|---|---|
| /data/games | sdc2 | 788 GB | Game server worlds and kits |
| /data/media | sdc1 | 1.1 TB | Jellyfin media library |
| /data/photos | sda | 916 GB | Immich photo library |
| /data/owncloud | sdd | 916 GB | OwnCloud files |
| /data/infra | sde | 458 GB | Prometheus, infra data |
| /data/ai | sdf | 916 GB | Paperclip, Ollama models |
| /mnt/sdb-ro | sdb | 1.8 TB | READ-ONLY archive — never modify |
### HP VTrak ZFS datasets (HA mirrors)

| ZFS Dataset | Mountpoint on HP | NFS export |
|---|---|---|
| media-pool/jellyfin | /data/media | ✅ |
| media-pool/immich | /data/photos | ✅ |
| media-pool/owncloud | /data/owncloud | ✅ |
| media-pool/games | /data/games | ✅ |
| media-pool/minecraft | /data/games/minecraft | ✅ |
| media-pool/factorio | /data/games/factorio | ✅ |
| media-pool/openttd | /data/games/openttd | ✅ |
| media-pool/infra | /data/infra | ✅ |
| media-pool/ai | /data/ai | ✅ |

Legacy bind mounts at /media-pool/X → /data/X preserved for K8s PV compatibility.
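A sketch of what such a bind mount looks like and how to inspect the live ones; the exact source/target pairing per dataset should be read from the HP's fstab, the entry below is only the general shape with placeholder paths:

```bash
# List the current /media-pool/* bind mounts on the HP
findmnt | grep media-pool

# Shape of a bind-mount entry in /etc/fstab (placeholder paths, not the live config)
# /data/media   /media-pool/media   none   bind   0 0
```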
### Cross-mounts (HA access)

| From | Mount point | To |
|---|---|---|
| Dell | /mnt/hp/data-{games,media,photos,owncloud,infra,ai} | HP VTrak NFS |
| HP | /mnt/dell/data-{games,media,photos,owncloud,infra,ai} | Dell NFS |
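A sketch of one such cross-mount on the Dell side, assuming the HP exports over its 192.168.2.193 address and standard NFS mount options (check the live fstab for the real entries):

```bash
# Illustrative /etc/fstab entry on the Dell for one HP cross-mount
# 192.168.2.193:/data/media   /mnt/hp/data-media   nfs   defaults,_netdev   0 0

# Show what is currently NFS-mounted on either host
findmnt -t nfs,nfs4
```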
## VMs on HP ProLiant (Proxmox)

| VM ID | Name | IP | RAM | Role |
|---|---|---|---|---|
| 100 | kube-node-2 | 192.168.2.195 | 16 GB | K8s worker |
| 101 | kube-node-3 | 192.168.2.196 | 16 GB | K8s control-plane + worker |
| 103 | kube-arbiter | 192.168.2.200 | 6 GB | K8s control-plane (etcd + API server, NoSchedule) |
| 104 | local-ai | 192.168.2.88 | — | Ollama + openclaw-gateway (Tesla P4 GPU passthrough) |
| 106 | ansible-control | 192.168.2.70 | — | Ansible control node |
| 107 | remote-ai | 192.168.2.91 | — | openclaw-gateway (remote, cloud AI) |

⚠️ kube-node-2, kube-node-3, and kube-arbiter are all VMs on the HP ProLiant. HP ProLiant failure = loss of 3/4 K8s nodes simultaneously. Mitigation: add a Raspberry Pi 4/5 (8 GB) as a 4th physical host.

SSH: `dan@<ip>` for all VMs
## Kubernetes Cluster

- Version: 1.32.13
- CNI: Flannel
- Dashboard: https://192.168.2.100:30443 (self-signed cert, token auth)
- Token file: `/home/dan/homelab/kube/cluster/DASHBOARD-ACCESS.txt`
- StorageClass: `local-storage` (hostPath on kube-node-1)
- NFS provisioners: `nfs-provisioners` namespace (nfs-subdir-external-provisioner)
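A minimal sketch of claiming storage from the `local-storage` class; the claim name, namespace, and size are examples, and since the class is hostPath-backed on kube-node-1 a matching PV must already exist and consumers end up pinned to that node:

```bash
# Illustrative PVC against local-storage (dry-run only; not a real workload)
cat <<'EOF' | kubectl apply --dry-run=client -f -
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: example-local-pvc
  namespace: default
spec:
  storageClassName: local-storage
  accessModes: ["ReadWriteOnce"]
  resources:
    requests:
      storage: 5Gi
EOF
```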
### Nodes

| Node | Role | IP | Host |
|---|---|---|---|
| kube-node-1 | control-plane + worker | 192.168.2.100 | Dell OptiPlex 7070 (bare metal) |
| kube-node-2 | worker | 192.168.2.195 | VM on HP ProLiant (16 GB RAM) |
| kube-node-3 | control-plane + worker | 192.168.2.196 | VM on HP ProLiant (16 GB RAM) |
| kube-arbiter | control-plane | 192.168.2.200 | VM on HP ProLiant (1c/6GB, tainted NoSchedule) |

etcd: 3 members (kube-node-1 + kube-arbiter + kube-node-3) — quorum survives 1 member failure ✅

controlPlaneEndpoint: 192.168.2.100:6443 ⚠️ SPOF — kube-vip (Phase 1b) not yet deployed; if kube-node-1 goes down, workers lose API access even though the kube-arbiter and kube-node-3 API servers are still running
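To confirm the three-member etcd cluster from any machine with cluster-admin access (certificate paths are the kubeadm defaults; the pod name assumes etcd running on kube-node-1):

```bash
# Which nodes run etcd
kubectl -n kube-system get pods -l component=etcd -o wide

# Member list from inside the etcd pod on kube-node-1 (kubeadm default cert paths)
kubectl -n kube-system exec etcd-kube-node-1 -- etcdctl \
  --endpoints=https://127.0.0.1:2379 \
  --cacert=/etc/kubernetes/pki/etcd/ca.crt \
  --cert=/etc/kubernetes/pki/etcd/server.crt \
  --key=/etc/kubernetes/pki/etcd/server.key \
  member list -w table
```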
## High Availability Status

### Control Plane

| Component | Status | Notes |
|---|---|---|
| etcd | ✅ 3 members | kube-node-1 + kube-arbiter + kube-node-3; tolerates 1 failure |
| API server VIP | ⚠️ Not yet deployed | controlPlaneEndpoint hardcoded to 192.168.2.100; kube-vip (Phase 1b) pending |
| CoreDNS | ✅ Required anti-affinity | Pods spread across different nodes (kube-node-1 + kube-node-2) |
### Workloads (replicas=2, required pod anti-affinity)

| Service | Replicas | PDB |
|---|---|---|
| authentik-server | 2 | ✅ |
| authentik-worker | 2 | ✅ |
| cert-manager | 2 | ✅ |
| cert-manager-webhook | 2 | ✅ |
| cert-manager-cainjector | 2 | ✅ |
| parts-api | 2 | ✅ |
| parts-ui | 2 | ✅ |
| ha-sync-ui | 2 | ✅ |
| games-console-backend | 2 | ✅ |
| games-console-ui | 2 | ✅ |
| ingress-nginx | DaemonSet | ✅ (runs on all workers) |
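The pattern behind this table, sketched with an illustrative deployment (names, labels, and image are placeholders, not taken from the real manifests): two replicas that must land on different nodes, plus a PDB that keeps at least one available during drains.

```bash
# Shape check only (dry-run); not one of the actual workloads above
cat <<'EOF' | kubectl apply --dry-run=client -f -
apiVersion: apps/v1
kind: Deployment
metadata:
  name: example-api
spec:
  replicas: 2
  selector:
    matchLabels: {app: example-api}
  template:
    metadata:
      labels: {app: example-api}
    spec:
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            - labelSelector:
                matchLabels: {app: example-api}
              topologyKey: kubernetes.io/hostname
      containers:
        - name: api
          image: example/api:latest
---
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: example-api
spec:
  minAvailable: 1
  selector:
    matchLabels: {app: example-api}
EOF
```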
### Storage

| PV | Type | Notes |
|---|---|---|
| paperclip-data-pv | NFS (192.168.2.252) | ✅ Migrated from hostPath; can schedule on any node |
| prometheus-storage-pv | hostPath on kube-node-1 | ⚠️ Still pinned to kube-node-1 (out of scope) |
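A sketch of what an NFS-backed PV against the keepalived VIP looks like; the capacity, access mode, and PV name here are illustrative, the real manifest lives under deployment/ai/:

```bash
# Shape check only (dry-run); server and path match the Paperclip NFS PV described in the ai section
cat <<'EOF' | kubectl apply --dry-run=client -f -
apiVersion: v1
kind: PersistentVolume
metadata:
  name: paperclip-data-pv-example
spec:
  capacity:
    storage: 10Gi
  accessModes: ["ReadWriteMany"]
  persistentVolumeReclaimPolicy: Retain
  nfs:
    server: 192.168.2.252
    path: /data/ai/paperclip
EOF
```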
### Known Remaining SPOFs

| Risk | Description | Mitigation |
|---|---|---|
| HP ProLiant physical host | kube-node-2/3 + kube-arbiter are all HP VMs | Add Raspberry Pi 4/5 (8 GB) as 4th physical host |
| controlPlaneEndpoint | Hardcoded to kube-node-1 IP | Deploy kube-vip with VIP (e.g. 192.168.2.50) |
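A sketch of the usual kube-vip bootstrap for the controlPlaneEndpoint mitigation, following the kube-vip static-pod recipe; the release version, interface name, and VIP below are assumptions to verify before use:

```bash
# Illustrative only: generate a kube-vip static-pod manifest on a control-plane node
export VIP=192.168.2.50          # candidate VIP from the table above
export INTERFACE=eno1            # assumption: same NIC name as on kube-node-1
export KVVERSION=v0.8.0          # pick a current kube-vip release
sudo ctr image pull ghcr.io/kube-vip/kube-vip:$KVVERSION
sudo ctr run --rm --net-host ghcr.io/kube-vip/kube-vip:$KVVERSION vip \
  /kube-vip manifest pod \
    --interface $INTERFACE --address $VIP \
    --controlplane --arp --leaderElection \
  | sudo tee /etc/kubernetes/manifests/kube-vip.yaml
```

Getting the VIP answering on :6443 is only half of Phase 1b; controlPlaneEndpoint, the node kubeconfigs, and the API server certificate SANs then have to be switched over to it as well.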
## games

| Service | NodePort | Storage |
|---|---|---|
| minecraft-home | 31112 | HP NFS /data/games/minecraft |
| minecraft-cheats | 31111 | HP NFS /data/games/minecraft |
| minecraft-creative | 31559 | HP NFS /data/games/minecraft |
| minecraft-johannes | 31563 | HP NFS /data/games/minecraft |
| minecraft-noah | 31560 | HP NFS /data/games/minecraft |
| Factorio | — | HP NFS /data/games/factorio |
| OpenTTD | — | HP NFS /data/games/openttd |

Minecraft operators: LadyGisela5, tomgates24, anutzalizuk, toranaga_samma
## monitoring

- Helm release: `obs`, chart prometheus-community/kube-prometheus-stack
- Values file: `/home/dan/homelab/deployment/helm/prometheus/prometheus-helm-values.yaml`
- Components: Prometheus, Grafana, AlertManager, Node Exporter, Kube State Metrics
- Grafana: NodePort 31473 → http://192.168.2.100:31473
- Storage: 100 Gi hostPath PV at `/data/infra/prometheus` on kube-node-1
## infrastructure

- General MySQL/MariaDB (StatefulSet) — HP NFS `/media-pool/general-db`
- Speedtest Tracker — HP NFS `/media-pool/speedtest`
- DNS updater (DaemonSet, `tunix/digitalocean-dyndns`) — updates DigitalOcean DNS
- Proxmox ingress → 192.168.2.193:8006
## storage

- OwnCloud (`owncloud/server:10.12`) — drive.vandachevici.ro, admin: sefu
- MariaDB (StatefulSet), Redis (Deployment), OwnCloud server (2 replicas)
- Storage: HP NFS `/data/owncloud`
## media

- Jellyfin — media.vandachevici.ro, storage: HP NFS `/data/media`
- Immich — photos.vandachevici.ro, storage: HP NFS `/data/photos`
- Components: server (2 replicas), ML (2 replicas), valkey, postgresql
## iot

- IoT MySQL (StatefulSet, db: `iot_db`)
- IoT API (`iot-api:latest`, NodePort 30800) — requires the `topology.homelab/server: dell` label
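How that label constraint plays out in practice; the node name comes from the Nodes table and the commands are standard kubectl, but whether the manifest uses a plain nodeSelector or node affinity should be checked in deployment/iot/:

```bash
# Ensure the Dell carries the label the IoT API schedules against
kubectl label node kube-node-1 topology.homelab/server=dell --overwrite
kubectl get nodes -L topology.homelab/server

# In the pod spec this is typically consumed as (sketch):
#   nodeSelector:
#     topology.homelab/server: dell
```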
## ai

- Paperclip — paperclip.vandachevici.ro
- Embedded PostgreSQL at `/data/ai/paperclip/instances/default/db`
- Config: `/data/ai/paperclip/instances/default/config.json`
- NFS PV via keepalived VIP `192.168.2.252:/data/ai/paperclip` (can schedule on any node) ✅ (keepalived sketch below)
- Env: `PAPERCLIP_AGENT_JWT_SECRET` (in K8s secret)
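The 192.168.2.252 VIP is what lets the NFS PV float between the Dell and HP NFS servers. A minimal keepalived sketch, assuming VRRP in the usual MASTER/BACKUP arrangement; the interface, router ID, and priorities are assumptions, not the live config:

```bash
# /etc/keepalived/keepalived.conf (illustrative shape only)
# vrrp_instance NFS_VIP {
#     state MASTER               # BACKUP on the peer
#     interface eno1             # interface name is an assumption
#     virtual_router_id 52
#     priority 150               # lower on the peer
#     advert_int 1
#     virtual_ipaddress {
#         192.168.2.252/24
#     }
# }

# Which host currently holds the VIP
ip -4 addr show | grep 192.168.2.252
```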
## AI / OpenClaw

### local-ai VM (192.168.2.88) — GPU instance

- GPU: NVIDIA Tesla P4, 8 GB VRAM (PCIe passthrough from Proxmox)
- VFIO: `/etc/modprobe.d/vfio.conf` ids=10de:1bb3, allow_unsafe_interrupts=1
- initramfs updated for persistence
- Ollama: listening on `0.0.0.0:11434`, models at `/data/ollama/models` (quick check below)
- Loaded: `qwen3:8b` (5.2 GB)
- openclaw-gateway: `ws://0.0.0.0:18789`, auth mode: token
- Token: in `~/.openclaw/openclaw.json` → `gateway.auth.token`
- Systemd: `openclaw-gateway.service` (Type=simple, enabled)
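A quick way to confirm Ollama is serving from another LAN host, using its standard HTTP API:

```bash
# List pulled models (should include qwen3:8b)
curl http://192.168.2.88:11434/api/tags

# One-off generation against the loaded model
curl http://192.168.2.88:11434/api/generate \
  -d '{"model": "qwen3:8b", "prompt": "ping", "stream": false}'
```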
### remote-ai VM (192.168.2.91)

- openclaw-gateway: installed (v2026.3.13), config at `~/.openclaw/openclaw.json`
- Uses cloud AI providers (Claude API key required)

### Connecting Paperclip to openclaw

- URL: `ws://192.168.2.88:18789/`
- Auth: token from `~/.openclaw/openclaw.json` → `gateway.auth.token`
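Before pointing Paperclip at the gateway, it helps to grab the token and confirm the port is reachable; both commands only assume the SSH access and config path documented above:

```bash
# Read the gateway token off the local-ai VM
ssh dan@192.168.2.88 "jq -r '.gateway.auth.token' ~/.openclaw/openclaw.json"

# Confirm the WebSocket port is reachable from wherever Paperclip runs
nc -zv 192.168.2.88 18789
```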
## Network Endpoints

### DNS subdomains managed (DigitalOcean)

photos, backup, media, chat, openttd, excalidraw, prv, drive, grafana, paperclip, proxmox
## Common Operations

### Apply manifests

```bash
kubectl apply -f /home/dan/homelab/deployment/<namespace>/
```

### Prometheus (Helm)

```bash
helm upgrade obs prometheus-community/kube-prometheus-stack \
  -n monitoring \
  -f /home/dan/homelab/deployment/helm/prometheus/prometheus-helm-values.yaml
```

### NFS provisioners (Helm)

```bash
# Example: jellyfin
helm upgrade nfs-jellyfin nfs-subdir-external-provisioner/nfs-subdir-external-provisioner \
  -n nfs-provisioners \
  -f /home/dan/homelab/deployment/helm/nfs-provisioners/values-jellyfin.yaml
```

### Troubleshooting: Flannel CNI after reboot

If all pods are stuck in ContainerCreating after a reboot:

```bash
# 1. Check the default route exists on kube-node-1
ip route show | grep default
# Fix: sudo ip route add default via 192.168.2.1 dev eno1
# Persist: check /etc/netplan/00-installer-config.yaml has a routes section

# 2. Restart the flannel pod on kube-node-1
kubectl delete pod -n kube-flannel -l app=flannel --field-selector spec.nodeName=kube-node-1
```

### Troubleshooting: kube-node-3 NotReady after reboot

Likely cause: swap was re-enabled.

```bash
ssh dan@192.168.2.196 "sudo swapoff -a && sudo sed -i 's|^/swap.img|#/swap.img|' /etc/fstab && sudo systemctl restart kubelet"
```
## Workspace Structure

```
/home/dan/homelab/
├── HOMELAB.md — this file
├── plan.md — original rebuild plan
├── step-by-step.md — execution tracker
├── deployment/ — K8s manifests and Helm values
│   ├── 00-namespaces.yaml
│   ├── ai/ — Paperclip
│   ├── default/ — DNS updater
│   ├── games/ — Minecraft, Factorio, OpenTTD
│   ├── helm/ — Helm values (prometheus, nfs-provisioners)
│   ├── infrastructure/ — ingress-nginx, cert-manager, general-db, speedtest, proxmox-ingress
│   ├── iot/ — IoT DB + API
│   ├── media/ — Jellyfin, Immich
│   ├── monitoring/ — (managed by Helm)
│   └── storage/ — OwnCloud
├── backups/ — K8s secrets backup (gitignored)
├── hardware/ — hardware spec docs
├── orchestration/
│   └── ansible/ — playbooks, inventory, group_vars, cloud-init
└── services/
    └── device-inventory/ — C++ CMake project: network device discovery
```