Cluster migration sanity check

GorgonzolaPrimavera

New Member
Jun 4, 2026
djn
Esteemed Proxmox peers,

I've now heard "opinions" from the artificial "intelligence" engines of note about my migration plan from "that other system" to pxmx. But I'd really like a sanity check from Real Brains to make sure I'm not heading into a deep crevasse.

If my first new Proxmox VE 9.1.1 Dell R730 host has a hefty heap of RAM, disk and CPU, can I put ZFS on all 12T of its drives and then add the second, clean node to a new cluster with host1 as the "founder" node? The second node would have only one of its four drives as ZFS and the other three left as untouched, bare HBA drives. At that point, I thought, I'd have a fledgling cluster with one node doing all the VM work as I drained the VMs off of host2 and crammed them into host1 for a day or so.

Next, I'd spin up a qdevice just to keep the pxmx cluster sane and then rinse and repeat on my third host: drain the VMs to some other virtualization host, turn it into pxmx with just a single 3.7T ZFS drive and three bare 3.7T drives, and join the cluster.

At that point I could fire up Ceph and migrate the VMs that are overcrowding host1. Once all those are drained, I'd rebuild host1 with the same layout: 1 OS ZFS drive and three bare HBA spindles that would get entered into the Ceph cluster. Hosts 4 and 5 would follow along, draining their VMs straight into the new cluster, getting converted à la hosts 2 and 3, and being added back into the cluster.

What did I fail to consider in this? I assume that if there are some VMs/LXCs running on host1 at first, it is still totally fair to add a bare node to a new cluster with host1 as the founder. I know I cannot add a host to a cluster if the new host has VMs on it. I wonder if there are other "gotchas" hidden in this plan. I've been kicking the tires on this for a few weeks now and thought this might be the easiest way to wrangle my way out of the "virtualization solution that shall not be named".

TIA,
Bill
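
For anyone who wants to put numbers on the Ceph stages of the plan above, here is a rough back-of-the-envelope sketch in Python. It is not a sizing tool. It assumes every converted host contributes three bare 3.7T drives (the post only states the size for host 3), a replicated pool keeping 3 copies, and replicas spread across separate hosts. It ignores Ceph overhead and full-ratio headroom. The function name and stage labels are made up for illustration.

```python
def ceph_stage_estimate(osd_hosts, drives_per_host=3, drive_tb=3.7, replicas=3):
    """Rough raw/usable capacity for one stage of the migration.

    Assumptions (not from the thread): 3 copies per object, one copy
    per host, no allowance for overhead or free-space headroom.
    """
    raw_tb = osd_hosts * drives_per_host * drive_tb
    usable_tb = raw_tb / replicas
    # With one copy per host, a 3-copy pool needs at least 3 hosts with OSDs.
    enough_hosts = osd_hosts >= replicas
    return raw_tb, usable_tb, enough_hosts


if __name__ == "__main__":
    # Stage labels follow the order of the plan in the post.
    stages = [
        ("hosts 2 and 3 converted, host 1 still all ZFS", 2),
        ("host 1 rebuilt with three bare drives", 3),
        ("hosts 4 and 5 converted", 5),
    ]
    for label, hosts in stages:
        raw, usable, ok = ceph_stage_estimate(hosts)
        print(f"{label}: raw {raw:.1f} TB, usable {usable:.1f} TB, "
              f"enough hosts for 3 copies: {ok}")
```

Under those assumptions, the thing to check is the first stage: while host 1 still holds all of its drives in ZFS, only two hosts have drives available to Ceph.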
 
So you’re re-using the existing nodes?

One documented gotcha is you must remove the Qdevice before adding or removing nodes.

Is Ceph going on rotating drives? It can work, but SSDs are optimal.
 
Yes, I'm bulldozing the existing hosts like Lightning McQueen through Radiator Springs. But I need the compute around because I don't have spare compute readily at hand.

So if I get to two nodes and add a qdevice, it really doesn't do much except as a complete stop-gap? If I ready node 3, zap the qdevice, then add node 3, could I reuse the qdevice temporarily when I disconnect node 1 after the VMs on it get drained?

*Yes, all the drives are spinning rust. I do not have a good way to add an SSD even as a WAL.
 
Yes, in that case the Qdevice would only serve a function if either node 1 or node 2 was offline. Technically it's not needed unless one does, e.g., a reboot, but it's safer. After you have 3 nodes I’d just wait until the end…where it’s better to have an odd number of votes.
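
The vote arithmetic behind this can be sketched in a few lines of Python. It is only a rough model, assuming one vote per node, one vote for the Qdevice, and that the Qdevice stays reachable and sides with the nodes that are still up. The function names are hypothetical and are not part of any Proxmox tool.

```python
def quorum_threshold(total_votes):
    """Smallest strict majority of the total votes."""
    return total_votes // 2 + 1


def has_quorum(nodes_online, nodes_total, qdevice=False):
    """True if the online nodes (plus the Qdevice vote, if any) hold a majority.

    Assumption: the Qdevice contributes exactly one vote and is reachable.
    """
    extra = 1 if qdevice else 0
    return nodes_online + extra >= quorum_threshold(nodes_total + extra)


if __name__ == "__main__":
    # One node down in each layout discussed in the thread.
    for nodes_total, qdevice in [(2, False), (2, True), (3, False)]:
        ok = has_quorum(nodes_total - 1, nodes_total, qdevice)
        print(f"{nodes_total} nodes, qdevice={qdevice}, one node down: quorum={ok}")
```

In this model, a two-node cluster without the Qdevice loses quorum as soon as either node reboots, while two nodes plus a Qdevice, or three nodes on their own, ride out a single node being down.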