Overview
HA Sync is a homelab service that keeps six NFS-exported data folders in sync between two
physical servers — a Dell OptiPlex 7070 (192.168.2.100) and an
HP ProLiant DL360 G7 (192.168.2.193) — so that either machine
can take over if the other goes down.
Sync runs as Kubernetes CronJobs every 15 minutes. Each folder pair has two jobs: one copying Dell → HP, and one copying HP → Dell (bidirectional, last-writer-wins).
All jobs run with --dry-run until you remove that flag. The dashboard shows what would be synced, but no files are actually moved until you enable real mode.
Sync Pairs
| Pair | Dell path | HP path | Description |
|---|---|---|---|
| media | /data/media | /data/media | Movies, TV shows, music |
| photos | /data/photos | /data/photos | Personal photo library |
| owncloud | /data/owncloud | /data/owncloud | OwnCloud user data |
| games | /data/games | /data/games | Game storage |
| infra | /data/infra | /data/infra | Infrastructure configs & DB data |
| ai | /data/ai | /data/ai | AI model weights & datasets |
How a Sync Run Works
- Lease acquisition: the CronJob pod acquires a Kubernetes Lease object (coordination.k8s.io/v1) named ha-sync-<pair>. If another pod for the same pair is already running, the new pod exits immediately. The lease is heartbeated every TTL/3 seconds and auto-expires on crash.
- Tree walk: source and destination directories are walked in parallel. Each file's path, size, and modification time are collected into a hash map.
- Comparison: files are compared by mtime and size. If the sizes match and the mtimes differ by less than 2 seconds (configurable), the files are considered equal and skipped. On an mtime or size mismatch, an MD5 comparison is triggered to avoid false positives.
- Copy / delete: a configurable worker pool (default 4) processes the operation queue. os.Chtimes() preserves the source mtime on every copy, which prevents the reverse-direction job from re-copying the same file.
- Opslog flush: each operation is appended to a local JSONL file (/var/log/ha-sync/, backed by NFS). After all ops complete, the file is bulk-inserted into MySQL and deleted. If MySQL is down, the file is retried on the next run.
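The comparison step can be sketched in Go roughly as follows. This is a minimal illustration, not the service's actual code: the FileMeta type and the sameByMeta, md5Of, and needsCopy names are hypothetical, and it assumes the MD5 check runs on any metadata mismatch, as described above.

```go
package main

import (
	"crypto/md5"
	"fmt"
	"io"
	"os"
	"time"
)

// FileMeta is a hypothetical record for one file found by the tree walk.
type FileMeta struct {
	Path    string
	Size    int64
	ModTime time.Time
}

// sameByMeta reports whether two entries have equal size and mtimes
// that differ by less than tol (2 seconds by default in the text above).
func sameByMeta(src, dst FileMeta, tol time.Duration) bool {
	if src.Size != dst.Size {
		return false
	}
	d := src.ModTime.Sub(dst.ModTime)
	if d < 0 {
		d = -d
	}
	return d < tol
}

// md5Of streams a file through MD5 and returns its digest.
func md5Of(path string) ([md5.Size]byte, error) {
	var sum [md5.Size]byte
	f, err := os.Open(path)
	if err != nil {
		return sum, err
	}
	defer f.Close()
	h := md5.New()
	if _, err := io.Copy(h, f); err != nil {
		return sum, err
	}
	copy(sum[:], h.Sum(nil))
	return sum, nil
}

// needsCopy skips files that match on metadata and falls back to an
// MD5 comparison when size or mtime disagree.
func needsCopy(src, dst FileMeta, tol time.Duration) (bool, error) {
	if sameByMeta(src, dst, tol) {
		return false, nil
	}
	a, err := md5Of(src.Path)
	if err != nil {
		return false, err
	}
	b, err := md5Of(dst.Path)
	if err != nil {
		return false, err
	}
	return a != b, nil
}

func main() {
	now := time.Now()
	src := FileMeta{Path: "a", Size: 10, ModTime: now}
	dst := FileMeta{Path: "b", Size: 10, ModTime: now.Add(-1 * time.Second)}
	fmt.Println("within tolerance:", sameByMeta(src, dst, 2*time.Second))
}
```

Checking metadata first keeps the common case cheap: only files whose size or mtime disagree are read in full for hashing.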
Loop Prevention
Because sync is bidirectional, a naïve implementation would copy a file from A→B, then copy it back B→A on the next run, forever. HA Sync avoids this by preserving the source file's mtime on every copy. On the next run the comparison sees equal mtimes and skips the file.
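A minimal Go sketch of an mtime-preserving copy, assuming a plain file-to-file copy; copyPreservingMtime and roundTripKeepsMtime are hypothetical names, and the real service may handle permissions, temp files, and errors differently.

```go
package main

import (
	"fmt"
	"io"
	"os"
	"path/filepath"
	"time"
)

// copyPreservingMtime copies src to dst, then stamps dst with src's mtime
// so that the reverse-direction comparison sees the two files as equal.
func copyPreservingMtime(src, dst string) error {
	info, err := os.Stat(src)
	if err != nil {
		return err
	}
	in, err := os.Open(src)
	if err != nil {
		return err
	}
	defer in.Close()

	out, err := os.OpenFile(dst, os.O_WRONLY|os.O_CREATE|os.O_TRUNC, info.Mode().Perm())
	if err != nil {
		return err
	}
	if _, err := io.Copy(out, in); err != nil {
		out.Close()
		return err
	}
	if err := out.Close(); err != nil {
		return err
	}
	// Set the mtime only after the file is closed, so the write itself
	// cannot bump it again. The atime is set to the current time.
	return os.Chtimes(dst, time.Now(), info.ModTime())
}

// roundTripKeepsMtime copies a file with an old mtime inside a temp
// directory and reports whether the copy carries the same mtime.
func roundTripKeepsMtime() (bool, error) {
	dir, err := os.MkdirTemp("", "ha-sync-demo")
	if err != nil {
		return false, err
	}
	defer os.RemoveAll(dir)

	src := filepath.Join(dir, "a.txt")
	dst := filepath.Join(dir, "b.txt")
	if err := os.WriteFile(src, []byte("hello"), 0o644); err != nil {
		return false, err
	}
	old := time.Now().Add(-48 * time.Hour).Truncate(time.Second)
	if err := os.Chtimes(src, old, old); err != nil {
		return false, err
	}
	if err := copyPreservingMtime(src, dst); err != nil {
		return false, err
	}
	si, err := os.Stat(src)
	if err != nil {
		return false, err
	}
	di, err := os.Stat(dst)
	if err != nil {
		return false, err
	}
	return si.ModTime().Equal(di.ModTime()), nil
}

func main() {
	ok, err := roundTripKeepsMtime()
	fmt.Println("mtime preserved:", ok, "error:", err)
}
```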
In a write conflict (both sides modified the same file between runs), the newest mtime wins — the more recently modified copy is treated as the source of truth and overwrites the other.
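The newest-mtime-wins rule could be expressed as a small decision function. The sketch below is an assumption about how such a rule might look, not the service's code: the Winner type, the resolve and at helpers, and the reuse of the 2-second tolerance for conflicts are all hypothetical.

```go
package main

import (
	"fmt"
	"time"
)

// Winner says which side should be treated as the source of truth.
type Winner int

const (
	InSync Winner = iota // mtimes are within tolerance; nothing to do
	DellWins
	HPWins
)

// resolve picks the side with the newer mtime. Differences smaller
// than tol are treated as equal, matching the comparison tolerance.
func resolve(dellMtime, hpMtime time.Time, tol time.Duration) Winner {
	d := dellMtime.Sub(hpMtime)
	switch {
	case d >= tol:
		return DellWins
	case d <= -tol:
		return HPWins
	default:
		return InSync
	}
}

// at is a small helper that builds a time from Unix seconds.
func at(sec int64) time.Time {
	return time.Unix(sec, 0)
}

func main() {
	tol := 2 * time.Second
	fmt.Println(resolve(at(100), at(90), tol) == DellWins)
	fmt.Println(resolve(at(90), at(100), tol) == HPWins)
	fmt.Println(resolve(at(100), at(101), tol) == InSync)
}
```

Note that last-writer-wins relies on the two servers' clocks being reasonably close; a skewed clock on one side can make an older edit look newer.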
Dry-Run & Idempotency
Running with --dry-run computes all would-be operations and saves them to
the sync_iterations / sync_operations tables with
dry_run = 1, but makes no file changes. The dashboard marks these rows
with a DRY badge.
If you trigger a second dry-run before anything changes on disk, the service detects that the new would-be op set is identical to the previous one, skips writing new DB rows, and prints "no changes since last dry-run".
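The text does not say how the service detects an identical op set. One plausible approach, sketched here purely as an assumption, is to hash a canonical (sorted) encoding of the operations and compare it with the fingerprint stored for the previous dry-run. The Op type and the fingerprint function are hypothetical.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"sort"
)

// Op is a hypothetical would-be operation computed by a dry-run.
type Op struct {
	Kind string // e.g. "copy" or "delete"
	Path string
	Size int64
}

// fingerprint returns a hash of the op set that does not depend on the
// order in which the operations were discovered.
func fingerprint(ops []Op) string {
	// Sort a copy so the caller's slice is left untouched.
	sorted := append([]Op(nil), ops...)
	sort.Slice(sorted, func(i, j int) bool {
		if sorted[i].Path != sorted[j].Path {
			return sorted[i].Path < sorted[j].Path
		}
		return sorted[i].Kind < sorted[j].Kind
	})
	h := sha256.New()
	for _, op := range sorted {
		// NUL separators keep fields unambiguous; paths cannot contain NUL.
		fmt.Fprintf(h, "%s\x00%s\x00%d\n", op.Kind, op.Path, op.Size)
	}
	return hex.EncodeToString(h.Sum(nil))
}

func main() {
	first := []Op{
		{Kind: "copy", Path: "/data/media/a.mkv", Size: 10},
		{Kind: "delete", Path: "/data/media/b.mkv"},
	}
	prev := fingerprint(first)

	// A second dry-run finds the same operations in a different order.
	second := []Op{first[1], first[0]}
	if fingerprint(second) == prev {
		fmt.Println("no changes since last dry-run")
	}
}
```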
Enabling Real Sync
When you are satisfied with the dry-run output, edit the CronJob:

    kubectl -n infrastructure edit cronjob ha-sync-media-dell-to-hp

Remove --dry-run from .spec.jobTemplate.spec.template.spec.containers[0].args.

Add --delete-missing to the args of the primary direction CronJob. Do not enable it on both directions simultaneously.
Infrastructure
| Component | Detail |
|---|---|
| Language | Go 1.22, single static binary |
| Locking | Kubernetes Lease (coordination.k8s.io/v1), no MySQL dependency for locks |
| Database | MySQL 9 (general-purpose-db StatefulSet, general_db schema) |
| Storage | NFS PersistentVolumes, RWX — both servers export /data/* |
| Schedule | Dell→HP every 15 min; HP→Dell at :07, :22, :37, :52 (staggered) |
| Workers | 4 concurrent copy goroutines per run (configurable) |
| Log retention | Opslog JSONL files kept for 10 days before purge |