Between its support for LXCs, community scripts, and simple management UI, Proxmox has a ton of features to make home labs more accessible to beginners and casual users. Unlike its rivals (especially ESXi), Proxmox requires minimal CPU, memory, and storage provisions. It also works right-out-of-the-box with most hardware, making it a terrific option for budget-friendly setups. However, despite its approachable nature, Proxmox features plenty of advanced tools to enhance your home lab workloads.
Clustering is one such utility that, when combined with a handful of inexpensive PVE nodes, can turn them into reliable self-hosting workstations. In fact, I’ve got some spare rigs that I’ve been itching to put to good use, and since a high-availability cluster seemed like a fun project, I figured I could try building one using a mixture of cheap devices and old hardware. And well, it works better than I expected!
What’s the point of creating a Proxmox cluster?
Centralized UI and high-availability provisions
Proxmox’s intuitive web UI includes all the settings, options, and toggles (and even a terminal shell) to help you tinker with virtual machines and containers. However, managing your virtual guests can be a challenge when you’ve got multiple PVE hosts. Take my setup, for instance. I’ve got an old PC that I tend to use for virtualization and distro-hopping experiments, and since it has enough RAM and CPU, it would make for a fine VM-hosting workstation. Meanwhile, my x86 SBCs and NAS units work well as LXC-hosting servers due to their limited processing capabilities. Combining the nodes inside a cluster would let me access them from a centralized web UI, and I wouldn’t have to constantly switch between different tabs when deploying my arsenal.
However, the real star of the show is the fault tolerance offered by a high-availability setup. I often conduct wacky experiments with my home lab equipment, which can potentially bring down my entire node if I mistakenly edit some config files. A high-availability cluster that can migrate my essential LXCs from a downed host to another node is the perfect remedy for this problem. Then there’s the fact that my backwater town often has power outages, and hooking two low-power nodes with my UPS should let me keep my mission-critical services operational even when my main experimentation system goes offline during a blackout.
I picked old and budget-friendly devices for my cluster
Before I discuss the procedure for creating my cluster environment, I want to go over the hardware powering the high-availability setup. Since I planned to use budget-friendly devices, my options were fairly limited. My Ryzen 5 1600 PC, which features 16GB of memory, was an obvious choice, since I built it ages ago and didn’t have to pay a dime for it. For the secondary node, I went with the Aiffro K100, an all-SSD NAS that’s armed with an Intel N100 CPU alongside 8GB of memory.
I had a couple of options for the last node: I could either go for my decade-old laptop that’s currently serving as an LXC-hosting machine, or pick my favorite x86 SBC, the Radxa X4. However, their storage options were fairly limited, so I had to look into another alternative. In the end, I chose my ZimaBoard 2 as the last node and armed each machine with a 500GB NVMe SSD for Ceph storage (though I had to use a PCIe-to-NVMe adapter to pair the high-speed drive with the ZimaBoard 2).
Deploying the Proxmox cluster
It was a cakewalk
One of the biggest caveats when setting up a Proxmox cluster is that the nodes joining it can’t have any virtual guests deployed on them. Since I started with a fresh installation on each node, I didn’t have to rely on the old backup-and-restore trick to avoid losing my VM and LXC data on the secondary nodes. Before creating the cluster, I disabled the enterprise repos inside the Repositories tab on each node and added the No-Subscription and Ceph Squid No-Subscription repos in their place. Just to make things easier for myself, I also used the Updates tab to install the latest packages on all nodes.
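For anyone who’d rather stay in a shell, the Repositories tab is just toggling APT source files under the hood. Here’s a rough sketch of the same steps on a fresh Proxmox VE 9 (Debian trixie) install; the file names and suites differ on older releases, so double-check them on your own nodes:

```bash
# Disable the enterprise repos (apt ignores files without a .sources/.list suffix)
mv /etc/apt/sources.list.d/pve-enterprise.sources /etc/apt/sources.list.d/pve-enterprise.sources.disabled
mv /etc/apt/sources.list.d/ceph.sources /etc/apt/sources.list.d/ceph.sources.disabled

# Add the No-Subscription repo
cat > /etc/apt/sources.list.d/pve-no-subscription.sources <<'EOF'
Types: deb
URIs: http://download.proxmox.com/debian/pve
Suites: trixie
Components: pve-no-subscription
Signed-By: /usr/share/keyrings/proxmox-archive-keyring.gpg
EOF

# Add the Ceph Squid No-Subscription repo
cat > /etc/apt/sources.list.d/ceph-squid-no-subscription.sources <<'EOF'
Types: deb
URIs: http://download.proxmox.com/debian/ceph-squid
Suites: trixie
Components: no-subscription
Signed-By: /usr/share/keyrings/proxmox-archive-keyring.gpg
EOF

# Pull the latest packages on each node (same as the Updates tab)
apt update && apt full-upgrade -y
```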
With the initial configuration complete, I logged into the Proxmox web UI on my old PC and headed to the Cluster section within the Datacenter tab. Then, I pressed the Create Cluster button and gave a **Name** to my soon-to-be-created cluster environment before tapping the Create button. In less than a minute, my cluster was up and running. After copying the Join Information key, I switched to the Cluster tab on my NAS machine. This time, though, I went with the Join Cluster option, pasted the key I copied earlier, and hit the **Join** button. Once I’d repeated the process for the third node, all three machines were part of the same cluster.
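For what it’s worth, the same thing can be scripted over SSH with the pvecm tool; here’s a rough CLI equivalent (the cluster name and the primary node’s IP are placeholders for my setup):

```bash
# On the primary node: create the cluster
pvecm create homelab

# On each secondary node: join using the primary node's IP
# (prompts for the primary node's root password and fingerprint)
pvecm add 192.168.1.10

# Verify quorum once all three nodes are in
pvecm status
```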
Setting up distributed storage
I went with good ol’ Ceph
Since a high-availability setup requires distributed storage, my work was far from done. Technically, I could’ve configured ZFS pools bearing the same name on all three hosts, but I went with a Ceph-based setup, partly because I needed its superior fault tolerance for my cluster, and also because I wanted to tinker with this fun technology.
Anyway, I switched to the Ceph tab under my primary node, pressed the Install Ceph button to grab the necessary packages, and tapped Y inside the terminal to install them. Then, I selected the IP address of my primary cluster node to finish configuring the Ceph packages and repeated the previous step for the remaining nodes.
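Under the hood, that wizard wraps Proxmox’s pveceph tooling, so the terminal route looks roughly like this (the subnet is a stand-in for whatever network your nodes actually share):

```bash
# On each node: install the Ceph packages from the no-subscription repo
pveceph install --repository no-subscription

# On the primary node only: initialize Ceph on the cluster network
pveceph init --network 192.168.1.0/24

# Create a monitor on the node (repeat on the others for monitor quorum)
pveceph mon create
```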
Next, it was time to add the 500GB drives as the Object Storage Devices, which was as simple as heading to the OSD tab, pressing Create: OSD, and picking the NVMe SSDs as the Disks. I also opened the Monitor tab and added the secondary Hosts via the Create button. Finally, I navigated to the Pools section, hit Create, and gave a **Name** to my clustered storage while leaving the rest of the options at their default values.
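Again, these steps map to a few pveceph one-liners; a quick sketch, assuming each node’s spare SSD shows up as /dev/nvme0n1:

```bash
# On every node: turn the 500GB NVMe drive into an OSD
pveceph osd create /dev/nvme0n1

# On one node: create the replicated pool and register it as PVE storage
pveceph pool create cluster-pool --add_storages

# Sanity-check that all three OSDs are up and in
ceph osd tree
```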
Tinkering with high-availability rules
And migrating my virtual guests between the nodes
Proxmox 9 shakes up the high-availability options found in older versions. Rather than supporting HA Groups, the latest release has you create rules for each virtual guest, and you can fine-tune which node gets to host them should the original one go offline.
Since I didn’t have any LXCs or VMs configured in the beginning, I used the Proxmox VE Helper-Scripts repo to deploy some containers on different nodes. In the meantime, I uploaded the ISO file for EndeavourOS to my old PC and used it to spin up a virtual machine. Just to confirm whether everything worked well, I used the Migrate option to send the VM to a different node, all while it was running. Within a few seconds, it was visible on my SBC-powered PVE host. I also repeated the experiment with LXCs, which migrated just as smoothly.
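Migration also works from the CLI, which comes in handy for scripting these tests; a short sketch with made-up guest IDs and a placeholder node name:

```bash
# Live-migrate a running VM (ID 100 here) to another node
qm migrate 100 zimaboard --online

# LXCs can't truly live-migrate, so Proxmox uses restart mode:
# the container is stopped, moved, and started again on the target
pct migrate 101 zimaboard --restart
```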
For the high-availability part of the equation, I headed to the HA tab within the Datacenter, pressed the Add button under Resources, and selected some LXCs. Then, I opened the Affinity Rules submenu and chose all three nodes for the high-availability setup. For the final test, I shut down the node running my CasaOS LXC, which prompted the cluster to migrate it to a different host.
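The resource side of this can be handled with ha-manager as well (I still set the affinity rules themselves through the UI, since that’s where Proxmox 9 surfaces them best); a minimal sketch with a placeholder container ID:

```bash
# Register an LXC (ID 101 here) as a high-availability resource
ha-manager add ct:101

# Check what the HA stack is tracking and where each guest runs
ha-manager status
```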
A Proxmox cluster is worth all the hassle
I’ll admit that a cluster may be overkill for the average home labber, especially considering all the extra hardware you’ll need to make it work. But since Proxmox requires extremely low system resources, you can build a capable cluster out of old devices. And few things are more satisfying than watching your essential LXCs come back to life on a different node when the original workstation goes down because of a failed experiment.