Advice on configuring high availability in a cluster

radar

Hi,
I'm new to virtualisation, so I'm looking for advice on how to set up my cluster.
I have 3 nodes: node1 has only one disk, while nodes 2 and 3 have 2 disks each. I'd like to implement HA on this cluster, but I don't want any external storage.
I installed PVE with the ZFS option on node1, which created a ZFS pool named rpool on this node.
For the 2 other nodes, I can create ZFS pools using the second disks.
Here are the storage configurations for the 3 nodes.
  1. Node1:
Code:
root@pve1:~# cat /etc/pve/storage.cfg
dir: local
        path /var/lib/vz
        content vztmpl,backup,iso

lvmthin: local-lvm
        thinpool data
        vgname pve
        content images,rootdir

zfspool: test
        pool rpool/data
        content images,rootdir
        nodes pve1
        sparse 0

Code:
root@pve1:~# pveversion -v
proxmox-ve: 8.2.0 (running kernel: 6.8.4-2-pve)
pve-manager: 8.2.2 (running version: 8.2.2/9355359cd7afbae4)
proxmox-kernel-helper: 8.1.0
proxmox-kernel-6.8: 6.8.4-2
proxmox-kernel-6.8.4-2-pve-signed: 6.8.4-2
ceph-fuse: 17.2.7-pve3
corosync: 3.1.7-pve3
criu: 3.17.1-2
glusterfs-client: 10.3-5
ifupdown2: 3.2.0-1+pmx8
ksm-control-daemon: 1.5-1
libjs-extjs: 7.0.0-4
libknet1: 1.28-pve1
libproxmox-acme-perl: 1.5.0
libproxmox-backup-qemu0: 1.4.1
libproxmox-rs-perl: 0.3.3
libpve-access-control: 8.1.4
libpve-apiclient-perl: 3.3.2
libpve-cluster-api-perl: 8.0.6
libpve-cluster-perl: 8.0.6
libpve-common-perl: 8.2.1
libpve-guest-common-perl: 5.1.1
libpve-http-server-perl: 5.1.0
libpve-network-perl: 0.9.8
libpve-rs-perl: 0.8.8
libpve-storage-perl: 8.2.1
libspice-server1: 0.15.1-1
lvm2: 2.03.16-2
lxc-pve: 6.0.0-1
lxcfs: 6.0.0-pve2
novnc-pve: 1.4.0-3
proxmox-backup-client: 3.2.0-1
proxmox-backup-file-restore: 3.2.0-1
proxmox-kernel-helper: 8.1.0
proxmox-mail-forward: 0.2.3
proxmox-mini-journalreader: 1.4.0
proxmox-offline-mirror-helper: 0.6.6
proxmox-widget-toolkit: 4.2.1
pve-cluster: 8.0.6
pve-container: 5.0.10
pve-docs: 8.2.1
pve-edk2-firmware: 4.2023.08-4
pve-esxi-import-tools: 0.7.0
pve-firewall: 5.0.5
pve-firmware: 3.11-1
pve-ha-manager: 4.0.4
pve-i18n: 3.2.2
pve-qemu-kvm: 8.1.5-5
pve-xtermjs: 5.3.0-3
qemu-server: 8.2.1
smartmontools: 7.3-pve1
spiceterm: 3.3.0
swtpm: 0.8.0+pve1
vncterm: 1.8.0
zfsutils-linux: 2.2.3-pve2

Code:
root@pve1:~# pvs
root@pve1:~# lvs
root@pve1:~# vgs
root@pve1:~#
  2. Node2:
Code:
root@pve2:~# cat /etc/pve/storage.cfg
dir: local
        path /var/lib/vz
        content vztmpl,backup,iso

lvmthin: local-lvm
        thinpool data
        vgname pve
        content images,rootdir

zfspool: test
        pool rpool/data
        content images,rootdir
        nodes pve1
        sparse 0

Code:
root@pve2:~# pveversion -v
proxmox-ve: 8.3.0 (running kernel: 6.8.12-8-pve)
pve-manager: 8.3.3 (running version: 8.3.3/f157a38b211595d6)
proxmox-kernel-helper: 8.1.0
proxmox-kernel-6.8: 6.8.12-8
proxmox-kernel-6.8.12-8-pve-signed: 6.8.12-8
proxmox-kernel-6.8.12-4-pve-signed: 6.8.12-4
proxmox-kernel-6.8.4-2-pve-signed: 6.8.4-2
ceph-fuse: 17.2.7-pve3
corosync: 3.1.7-pve3
criu: 3.17.1-2+deb12u1
glusterfs-client: 10.3-5
ifupdown2: 3.2.0-1+pmx11
ksm-control-daemon: 1.5-1
libjs-extjs: 7.0.0-5
libknet1: 1.28-pve1
libproxmox-acme-perl: 1.5.1
libproxmox-backup-qemu0: 1.5.1
libproxmox-rs-perl: 0.3.4
libpve-access-control: 8.2.0
libpve-apiclient-perl: 3.3.2
libpve-cluster-api-perl: 8.0.10
libpve-cluster-perl: 8.0.10
libpve-common-perl: 8.2.9
libpve-guest-common-perl: 5.1.6
libpve-http-server-perl: 5.2.0
libpve-network-perl: 0.10.0
libpve-rs-perl: 0.9.1
libpve-storage-perl: 8.3.3
libspice-server1: 0.15.1-1
lvm2: 2.03.16-2
lxc-pve: 6.0.0-1
lxcfs: 6.0.0-pve2
novnc-pve: 1.5.0-1
proxmox-backup-client: 3.3.2-1
proxmox-backup-file-restore: 3.3.2-2
proxmox-firewall: 0.6.0
proxmox-kernel-helper: 8.1.0
proxmox-mail-forward: 0.3.1
proxmox-mini-journalreader: 1.4.0
proxmox-offline-mirror-helper: 0.6.7
proxmox-widget-toolkit: 4.3.4
pve-cluster: 8.0.10
pve-container: 5.2.3
pve-docs: 8.3.1
pve-edk2-firmware: 4.2023.08-4
pve-esxi-import-tools: 0.7.2
pve-firewall: 5.1.0
pve-firmware: 3.14-3
pve-ha-manager: 4.0.6
pve-i18n: 3.3.3
pve-qemu-kvm: 9.0.2-5
pve-xtermjs: 5.3.0-3
qemu-server: 8.3.7
smartmontools: 7.3-pve1
spiceterm: 3.3.0
swtpm: 0.8.0+pve1
vncterm: 1.8.0
zfsutils-linux: 2.2.7-pve1

Code:
root@pve2:~# pvs
  PV             VG  Fmt  Attr PSize    PFree
  /dev/nvme0n1p3 pve lvm2 a--  <475.94g 16.00g

Code:
root@pve2:~# lvs
  LV            VG  Attr       LSize    Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert
  data          pve twi-aotz-- <348.82g             5.33   0.62                           
  root          pve -wi-ao----   96.00g                                                   
  swap          pve -wi-ao----    8.00g                                                   
  vm-100-disk-0 pve Vwi-aotz--    4.00g data        54.36                                 
  vm-102-disk-0 pve Vwi-aotz--   10.00g data        86.63                                 
  vm-107-disk-0 pve Vwi-aotz--   10.00g data        31.24                                 
  vm-108-disk-0 pve Vwi-aotz--    8.00g data        57.93
Code:
root@pve2:~# vgs
  VG  #PV #LV #SN Attr   VSize    VFree
  pve   1   7   0 wz--n- <475.94g 16.00g
  3. Node3:
Code:
root@pve3:~# cat /etc/pve/storage.cfg
dir: local
        path /var/lib/vz
        content vztmpl,backup,iso

lvmthin: local-lvm
        thinpool data
        vgname pve
        content images,rootdir

zfspool: test
        pool rpool/data
        content images,rootdir
        nodes pve1
        sparse 0

Code:
root@pve3:~# pveversion -v
proxmox-ve: 8.2.0 (running kernel: 6.8.4-2-pve)
pve-manager: 8.2.2 (running version: 8.2.2/9355359cd7afbae4)
proxmox-kernel-helper: 8.1.0
proxmox-kernel-6.8: 6.8.4-2
proxmox-kernel-6.8.4-2-pve-signed: 6.8.4-2
ceph-fuse: 17.2.7-pve3
corosync: 3.1.7-pve3
criu: 3.17.1-2
glusterfs-client: 10.3-5
ifupdown2: 3.2.0-1+pmx8
ksm-control-daemon: 1.5-1
libjs-extjs: 7.0.0-4
libknet1: 1.28-pve1
libproxmox-acme-perl: 1.5.0
libproxmox-backup-qemu0: 1.4.1
libproxmox-rs-perl: 0.3.3
libpve-access-control: 8.1.4
libpve-apiclient-perl: 3.3.2
libpve-cluster-api-perl: 8.0.6
libpve-cluster-perl: 8.0.6
libpve-common-perl: 8.2.1
libpve-guest-common-perl: 5.1.1
libpve-http-server-perl: 5.1.0
libpve-network-perl: 0.9.8
libpve-rs-perl: 0.8.8
libpve-storage-perl: 8.2.1
libspice-server1: 0.15.1-1
lvm2: 2.03.16-2
lxc-pve: 6.0.0-1
lxcfs: 6.0.0-pve2
novnc-pve: 1.4.0-3
proxmox-backup-client: 3.2.0-1
proxmox-backup-file-restore: 3.2.0-1
proxmox-kernel-helper: 8.1.0
proxmox-mail-forward: 0.2.3
proxmox-mini-journalreader: 1.4.0
proxmox-offline-mirror-helper: 0.6.6
proxmox-widget-toolkit: 4.2.1
pve-cluster: 8.0.6
pve-container: 5.0.10
pve-docs: 8.2.1
pve-edk2-firmware: 4.2023.08-4
pve-esxi-import-tools: 0.7.0
pve-firewall: 5.0.5
pve-firmware: 3.11-1
pve-ha-manager: 4.0.4
pve-i18n: 3.2.2
pve-qemu-kvm: 8.1.5-5
pve-xtermjs: 5.3.0-3
qemu-server: 8.2.1
smartmontools: 7.3-pve1
spiceterm: 3.3.0
swtpm: 0.8.0+pve1
vncterm: 1.8.0
zfsutils-linux: 2.2.3-pve2

Code:
root@pve3:~# pvs
  PV             VG  Fmt  Attr PSize    PFree
  /dev/nvme0n1p3 pve lvm2 a--  <475.94g 16.00g

Code:
root@pve3:~# lvs
  LV   VG  Attr       LSize    Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert
  data pve twi-a-tz-- <348.82g             0.00   0.48                           
  root pve -wi-ao----   96.00g                                                   
  swap pve -wi-ao----    8.00g

Code:
root@pve3:~# vgs
  VG  #PV #LV #SN Attr   VSize    VFree
  pve   1   3   0 wz--n- <475.94g 16.00g

Is it possible to have HA with this configuration? If yes, how can I configure it?
Thanks.
 
Hi,

HA means that a VM/LXC is automatically moved to another node if the one it is running on becomes unavailable.
If your ZFS storages have the same name on all nodes, you can enable replication. Note that in case of a node failure, any data written since the last replication is nevertheless lost.
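As a rough illustration of what the replication part looks like from the CLI (the storage name "test", the guest ID, job ID, target node and schedule below are only placeholders for this thread's setup):
Code:
# assumes a ZFS pool "test" exists on every node and is configured as a
# cluster-wide zfspool storage that is also named "test"
pvesr create-local-job 101-0 pve2 --schedule "*/15"   # replicate guest 101 to pve2 every 15 minutes
pvesr status                                          # check the state of all replication jobs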
 
Thank you @fba, but I created a storage named test on node1, then created a ZFS pool also named test on the other nodes, without adding it as a storage (following this video).
I'm running a container on node1, and I configured replication to nodes 2 and 3.
When the replication starts, I get the following logs:
Code:
2025-02-17 11:10:03 101-0: start replication job
2025-02-17 11:10:03 101-0: guest => CT 101, running => 0
2025-02-17 11:10:03 101-0: volumes => test:subvol-101-disk-0
2025-02-17 11:10:04 101-0: (remote_prepare_local_job) storage 'test' is not available on node 'pve2'
2025-02-17 11:10:04 101-0: end replication job with error: command '/usr/bin/ssh -e none -o 'BatchMode=yes' -o 'HostKeyAlias=pve2' -o 'UserKnownHostsFile=/etc/pve/nodes/pve2/ssh_known_hosts' -o 'GlobalKnownHostsFile=none' root@192.168.1.122 -- pvesr prepare-local-job 101-0 test:subvol-101-disk-0 --last_sync 0' failed: exit code 255
If I run the command I see in the logs, I get this output: storage 'test' is not available on node 'pve2'
And I'm not able to create a ZFS storage on nodes 2 and 3; when I attempt it, I get the following error: Option 'pool' (test) does not match existing storage configuration 'rpool/data' (500).
 
If you do not make the ZFS storage available on all three nodes, it will not work. The video you linked contains an error at this point. Did you notice that, suddenly, after the ZFS storage creation all nodes show a "Storage" entry? That wasn't the case after the steps for creating them.
Have a look at this comment on the video where this is pointed out: https://www.youtube.com/watch?v=FZnVmt_DvUk&lc=Ugw74J-BAbDyGNJPeMd4AaABAg
In short: the "Add Storage" checkbox must be enabled when creating the ZFS storage on all three nodes, not just the first one.
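A quick way to verify the result on each node (the storage name "test" here is just the one used earlier in this thread) might be:
Code:
# run on every node; both the pool and an active storage entry should show up
zpool list       # the ZFS pool itself, e.g. "test"
pvesm status     # the Proxmox storage entries; "test" should be listed as active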
 
Indeed, the comments point out that something goes wrong there.
However, I'm still not sure how to achieve HA with my services.
 
You can add existing ZFS pools by navigating to Datacenter > Storage > Add > ZFS > *select zfs pool*.
Afterwards, the enabled replication might have a chance to actually work.
For HA: what is the question?
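For reference, a CLI equivalent of that GUI path could look roughly like this (a sketch only; it assumes a pool named "test" exists on every node):
Code:
# create one cluster-wide storage entry that maps to the pool of the same name on each node
pvesm add zfspool test --pool test --content images,rootdir --sparse 1
# optionally restrict it to specific nodes, e.g.:
# pvesm set test --nodes pve1,pve2,pve3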
 
Thanks @fba for your response.
I'm a little bit lost.
On pve1, where I have only one disk, the only ZFS pool I have was created during the installation of Proxmox on this node and is automatically named rpool.
On pve2, I used the second, empty disk to create a ZFS disk (I can't say if it's a pool) named test2.
On pve3, I did not create any ZFS disk and my second disk is available.

When I go to Datacenter > Storage > Add > ZFS, the only ZFS pool I have is test2. I then created a ZFS storage named zfs-test, which then appears in the storage lists of pve1 and pve2 with an unknown status. A question here: is this storage physically only on pve2?

Then, if I try to replicate a CT running on pve2 to this newly created zfs-test, I get the following error: missing replicate feature on volume 'local-lvm:vm-102-disk-0' (500).
 
Then, if I try to replicate a CT running on pve2 to this newly created zfs-test, I get the following error: missing replicate feature on volume 'local-lvm:vm-102-disk-0' (500).
The container must be stored on a ZFS storage for replication to work. The current storage is LVM, so you need to move the disk:
  1. Stop the container.
  2. Select the Resources tab of the container.
  3. Select the disk.
  4. Use the Volume Action drop-down menu to move it to the ZFS storage.

If you make a single pool available on multiple nodes, you do not have working replication, because there is still only a single place where the data is stored. If the node that hosts the pool has an outage, the pool is unusable.
You need a second pool with the same name as the first on another node. In your case: on your third node, a pool with the name "test2" should be created.
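A rough CLI sketch of these steps (the device path, guest ID and storage names are placeholders based on this thread; adapt them to your hardware and guests):
Code:
# on pve3: create a pool with the same name as on pve2, using the actual second disk
zpool create test2 /dev/disk/by-id/<your-second-disk>
# make the matching storage entry usable on pve3 as well (GUI, or for example):
# pvesm set zfs-test --nodes pve2,pve3

# on pve2: move the container's root disk from local-lvm to the ZFS storage
pct stop 102
pct move-volume 102 rootfs zfs-test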
 
Hi,

Thanks @fba for your help.
I think my only option with this configuration is to do a fresh install of pve2 and pve3 with ZFS. That way I'd have a ZFS pool with the same name on all 3 nodes and could then use it.
I have since moved all my VMs and CTs to pve2, then reinstalled pve1 and pve3 with the ZFS option.
But now I'm no longer able to migrate my CTs and VMs from the UI to either pve1 or pve3.
I get the following error:
Code:
2025-03-01 21:46:57 starting migration of CT 104 to node 'pve3' (192.168.1.123)
2025-03-01 21:46:57 found local volume 'local-lvm:vm-104-disk-1' (in current VM config)
2025-03-01 21:46:57   Volume group "pve" not found
2025-03-01 21:46:57   Cannot process volume group pve
2025-03-01 21:46:57 command '/sbin/lvs --separator : --noheadings --units b --unbuffered --nosuffix --config 'report/time_format="%s"' --options vg_name,lv_name,lv_size,lv_attr,pool_lv,data_percent,metadata_percent,snap_percent,uuid,tags,metadata_size,time pve' failed: exit code 5
2025-03-01 21:46:57 command 'dd 'if=/dev/pve/vm-104-disk-1' 'bs=64k' 'status=progress'' failed: got signal 13
2025-03-01 21:46:57 ERROR: storage migration for 'local-lvm:vm-104-disk-1' to storage 'local-lvm' failed - command 'set -o pipefail && pvesm export local-lvm:vm-104-disk-1 raw+size - -with-snapshots 0 | /usr/bin/ssh -e none -o 'BatchMode=yes' -o 'HostKeyAlias=pve3' -o 'UserKnownHostsFile=/etc/pve/nodes/pve3/ssh_known_hosts' -o 'GlobalKnownHostsFile=none' root@192.168.1.123 -- pvesm import local-lvm:vm-104-disk-1 raw+size - -with-snapshots 0 -allow-rename 1' failed: exit code 5
2025-03-01 21:46:57 aborting phase 1 - cleanup resources
2025-03-01 21:46:57 ERROR: found stale volume copy 'local-lvm:vm-104-disk-1' on node 'pve3'
2025-03-01 21:46:57 start final cleanup
2025-03-01 21:46:57 ERROR: migration aborted (duration 00:00:01): storage migration for 'local-lvm:vm-104-disk-1' to storage 'local-lvm' failed - command 'set -o pipefail && pvesm export local-lvm:vm-104-disk-1 raw+size - -with-snapshots 0 | /usr/bin/ssh -e none -o 'BatchMode=yes' -o 'HostKeyAlias=pve3' -o 'UserKnownHostsFile=/etc/pve/nodes/pve3/ssh_known_hosts' -o 'GlobalKnownHostsFile=none' root@192.168.1.123 -- pvesm import local-lvm:vm-104-disk-1 raw+size - -with-snapshots 0 -allow-rename 1' failed: exit code 5
TASK ERROR: migration aborted
I created a ZFS pool on pve2 which I named rpool, like the ZFS pools on pve1 and pve3. I then moved the storage of one of my CTs to this storage on pve2. Then I tried to migrate it to either pve1 or pve3, always with the same error:
Code:
2025-03-01 23:06:38 ERROR: migration aborted (duration 00:00:00): storage 'rpool' is not available on node 'pve3'
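One possible direction for this kind of error (a sketch only, not verified against this exact setup: it assumes a pool named rpool really exists on every node and that the "rpool" storage entry is currently restricted to a single node via its nodes property):
Code:
# inspect the current storage definitions and their node restrictions
cat /etc/pve/storage.cfg
# widen the restriction so the storage is considered available on all nodes
pvesm set rpool --nodes pve1,pve2,pve3
pvesm status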


Thanks.
 
Would you like to share the config of the LXC in question (pct config 104)
and the current storage.cfg?
 
I have been able to solve all my issues once I managed to create a ZFS storage on my cluster. To be honest, I don't know what I did differently compared to last time, but this time I was able to create this ZFS storage, which allowed me to move the disks of all my CTs and VMs, add replication, and then add them to HA.
Thank you very much to all for the time and help you provided.
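For readers following along, the HA part of that workflow can also be done from the CLI; a minimal sketch (guest IDs and group name are placeholders):
Code:
# optional: define an HA group spanning all three nodes
ha-manager groupadd ha-all --nodes "pve1,pve2,pve3"
# add a replicated container and VM as HA resources
ha-manager add ct:101 --state started --group ha-all
ha-manager add vm:104 --state started --group ha-all
# check the HA stack
ha-manager status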
 