High-Level Cluster build help, please

aweber1nj

Member
Dec 20, 2023
I have an existing PVE node with multiple VMs on an LVM-Thin partition. (I also have a PBS for backups, which isn't important to the cluster, but may be helpful in the migration...)

I'd like to add a second PVE and get HA/cluster working. (Also have a QDevice ready)

To get the best setup in this small situation, I'm assuming I should build my second PVE node entirely on ZFS - or at least use ZFS for the VM storage.

Would it be possible to then move/migrate the production VMs to the new node (on ZFS) and then I'd have to "rebuild" the original node with ZFS partitions/pools to best support HA/replication?

I'm sure this has been done before. Maybe there's a way to "convert" an LVM-Thin volume to a single-disk ZFS pool (once the VMs are on the second node)?

Can anyone provide any guidance on best strategy to get from my current situation to the desired setup? Is my plan "close"?

Thanks in advance.
 
First, "best support", "HA" and "cluster" is only fun and working in a near-zero-loss scenario if you have 3 nodes and a shared storage (SAN or CEPH). This is very crucial to know that.

It should be possible to live-migrate the VMs to the second node and reinstall the first one, though I haven't done it myself on a live system. As an alternative, using PBS is always an option and will work flawlessly. Just stop your VM, do a backup with PBS and restore it on the other node for zero data loss and a cold, consistent filesystem.
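
Roughly, with placeholder names (VM 100, a PBS storage called "pbs", a target ZFS storage called "local-zfs"), that path could look like this:

Code:
# On the source node: stop the VM and take a consistent backup to PBS.
qm stop 100
vzdump 100 --storage pbs --mode stop

# On the new node (after removing the source VM, or under a new VMID):
# restore the backup onto the ZFS-backed storage. The archive name comes
# from the vzdump/PBS output.
qmrestore <backup-archive> 100 --storage local-zfs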
 
Would it be possible to then move/migrate the production VMs to the new node (on ZFS) and then I'd have to "rebuild" the original node with ZFS partitions/pools to best support HA/replication?
yes.
Maybe there's a way to "convert" a LVM-Thin to a ZFS 1-dev pool (once the VMs are on the second node)?
You can wipe the disks and reformat them with ZFS, sure.
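
There is no in-place conversion; you recreate the layout. A rough sketch with placeholder names (old thin pool in VG "pve-data" on /dev/sdb, new pool "tank", storage ID "local-zfs"), to be run only after every guest has been moved off the node:

Code:
# Remove the now-empty LVM-Thin storage definition and tear down the LVM layout.
pvesm remove local-lvm-thin      # storage ID as listed in /etc/pve/storage.cfg
lvremove -y pve-data             # removes the thin pool and any leftover LVs
vgremove pve-data
pvremove /dev/sdb

# Create a single-disk ZFS pool on the freed disk and register it with PVE.
zpool create -f tank /dev/sdb
pvesm add zfspool local-zfs --pool tank --content images,rootdir

If you plan to use the built-in replication later, define the ZFS storage with the same storage ID on both nodes, since replication expects matching storage on source and target.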

Can anyone provide any guidance on best strategy to get from my current situation to the desired setup? Is my plan "close"?
you've described the process pretty clearly. As long as what you end up with is workable for you, the only thing I would add is to make sure you have a good backup before proceeding.
 
Thank you for the responses. To clarify now that I have the cluster running:

To refresh: the original node has VMs and containers running on an LVM-Thin partition. The new/second node is not yet running anything, but has a ZFS pool set up (and available locally) with "Content=Disk Image, Container".

I am assuming I cannot replicate these VMs or containers to the new node (because the source is LVM-Thin)?

Should I migrate all the containers/VMs to the new node? Then I can "remove" the LVM-Thin storage from the original and replace it with a ZFS pool?

I'm pretty sure that once both nodes are using ZFS pools for the containers/VMs, I can set up replication for faster future migrations?
 
I am assuming I can not replicate these VMs or Containers to the new node (because the source is LVM-Thin)?
You can, in more than one way. The simplest is to use vzdump to make a backup and then restore it on the destination node.
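
For a container, with placeholder names (CT 101, backup storage "pbs", target ZFS storage "local-zfs"), that could look roughly like:

Code:
# On the source node: back up the container (stop mode gives a clean filesystem).
vzdump 101 --storage pbs --mode stop

# On the destination node: restore onto the ZFS storage. VMIDs are unique
# cluster-wide, so restore under a new ID or remove the source container first.
pct restore 201 <backup-archive> --storage local-zfs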
Should I migrate all the containers/VMs to the new node?
That is the simpler method, but the backup/restore method is safer AND provides you with a current backup.

once both nodes are using ZFS pools for the containers/VMs, I can setup replication for faster, future migrations?
yes. PVE has a built-in tool to keep two ZFS datasets synchronized.
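
That is the storage replication feature (Replication in the GUI, pvesr on the CLI). A minimal sketch, assuming guest 101 and a second node called "pve2":

Code:
# Create a replication job for guest 101 to node pve2, running every 15 minutes.
pvesr create-local-job 101-0 pve2 --schedule '*/15'

# Check the state of all replication jobs.
pvesr status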
 
Thank you for your assistance @alexskysilk !

Related question that I probably knew the answer to, but forgot: How is the "default" storage for a node determined? (If I choose to migrate a container, it appears I select a target NODE, but not the target storage.)
 
FWIW: I was unable to use the pct migrate command (even designating a -target-storage). It complained that "cannot migrate from storage type 'lvmthin' to 'zfspool' ".

I would have thought it to be storage-agnostic (like copying a file from one mount point to another in Linux does not care what the underlying filesystems are).

The downside to performing a backup/restore - besides the increased downtime - is that the IDs change (which mostly impacts some reporting). Also, I assume, the backup chain is broken: I will end up with legacy backups for the previous ID and new backups for the new ID.
 
besides the increased downtime
it all depends on your orchestration. you can spin up the replacement, turn off the NIC on the source, and turn on the NIC on the destination - virtually no downtime.
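
For a VM, one way to do that cut-over is with the link_down flag on the NIC. A hedged sketch (placeholder MAC, guest 100 on the old node, 200 for the restored copy):

Code:
# On the source node: disconnect the original guest's NIC while it keeps running.
# Reuse the existing net0 line from the guest config and append link_down=1 so
# the MAC address is preserved (a placeholder MAC is shown here).
qm set 100 --net0 virtio=BC:24:11:00:00:01,bridge=vmbr0,link_down=1

# On the destination node: start the restored copy with its NIC connected.
qm start 200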

FWIW: I was unable to use the pct migrate command (even designating a -target-storage). It complained that "cannot migrate from storage type 'lvmthin' to 'zfspool' ".
I was not aware of the limitation. good to know.
is that the IDs change
This is completely in your hands. I don't advocate being married to VMIDs under any circumstances, but if it really matters to you, changing VMIDs is trivial at any point. And if you'd rather not, just temporarily remove the source vmid.conf file when restoring, and you can keep the same VMID even at restore time (or change the VMID of the source before restoration, or change the target VMID after removing the source, etc.).
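
For example, keeping the same ID for a container (placeholder names: CT 101 coming from node "pve1", target ZFS storage "local-zfs"):

Code:
# On the source node: move the config out of the cluster filesystem so VMID 101 is free.
mv /etc/pve/nodes/pve1/lxc/101.conf /root/101.conf.bak

# On the destination node: restore the backup under the same VMID.
pct restore 101 <backup-archive> --storage local-zfs

# Once the restored container checks out, delete the saved config and the old
# disks on the source node instead of putting the config back.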
 