Proxmox VE 6.2 prevents the last node from rebooting with HA enabled

tawh

I have a three-node cluster with Ceph and DRBD installed as shared storage (Ceph is used entirely for the LINSTOR controller; that's a long story and not discussed in this thread).
I created two VMs: "vm:100" on Ceph (hosting the LINSTOR controller, as mentioned) and "vm:110" on DRBD (a Debian-based appliance).
Everything works perfectly: manual migration, Ceph and DRBD synchronization, and manual VM boot and shutdown.

I set up HA with the following customized configuration:
Under GUI Datacenter > Options
Code:
Migration Setting: network=172.16.0.1/24, type=secure
HA Settings: shutdown_policy=migrate
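If it helps with troubleshooting, these two GUI options should correspond to something like the following in /etc/pve/datacenter.cfg (a sketch of the expected format, not pasted verbatim from my node):
Code:
migration: network=172.16.0.1/24,type=secure
ha: shutdown_policy=migrate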
Under GUI Datacenter > HA > Groups
Code:
name: primary, restricted: yes, nofailback: no, Nodes: pve1, pve2 (Without pve3)
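The group should end up in /etc/pve/ha/groups.cfg roughly like this (again a sketch of the usual format, not copied from the cluster):
Code:
group: primary
        nodes pve1,pve2
        nofailback 0
        restricted 1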
I put two VMs under the HA with "started" state.
Code:
ID: vm:100, State: started, Max. Restart: 3, Max. Relocate: 3, Group: primary
ID: vm:110, State: started, Max. Restart: 3, Max. Relocate: 3, Group: primary
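In /etc/pve/ha/resources.cfg that should look roughly like this (sketch only):
Code:
vm: 100
        group primary
        max_relocate 3
        max_restart 3
        state started

vm: 110
        group primary
        max_relocate 3
        max_restart 3
        state started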

Problem encountered:
All VMs resided on pve1. I initiated a shutdown of pve1, and all VMs migrated to pve2 without any problem.
Then I shut down pve2, but it cannot reboot and waits forever.

The following message appeared on the console of pve2:
Code:
A stop job is running for PVE Local HA Resource Manager Daemon
From the GUI, I don't see any shutdown triggered for the VMs; they are still running.
I tried a manual shutdown from inside the guest OS, but the HA manager then starts the VM again.
If I click the shutdown button for the VMs in the Proxmox web GUI, the VMs go down and the node can reboot, but the VMs don't start after the reboot because their HA state changed to "stopped".
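For reference, I check what the HA stack thinks is happening with the standard pve-ha-manager CLI (output omitted here):
Code:
# current view of the CRM/LRM: node states and resource states
ha-manager status
# the configured HA resources (should list vm:100 and vm:110 in group "primary")
ha-manager config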

Initially, I suspected DRBD caused this issue, so I removed vm:110 from HA, but the problem still exists.
I noticed the ifupdown2 issue described elsewhere in this forum, but I verified that my nodes don't use ifupdown2.
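For completeness, this is roughly how I checked that, nothing PVE-specific, just dpkg:
Code:
# list installed ifupdown packages; ifupdown2 should not show up as installed (ii)
dpkg -l | grep -i ifupdown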

Below are my PVE package versions (the same on all 3 nodes):
Code:
proxmox-ve: 6.2-1 (running kernel: 5.4.34-1-pve)
pve-manager: 6.2-4 (running version: 6.2-4/9824574a)
pve-kernel-5.4: 6.2-1
pve-kernel-helper: 6.2-1
pve-kernel-5.4.34-1-pve: 5.4.34-2
ceph: 14.2.9-pve1
ceph-fuse: 14.2.9-pve1
corosync: 3.0.3-pve1
criu: 3.11-3
glusterfs-client: 5.5-3
ifupdown: 0.8.35+pve1
ksm-control-daemon: 1.3-1
libjs-extjs: 6.0.1-10
libknet1: 1.15-pve1
libproxmox-acme-perl: 1.0.3
libpve-access-control: 6.1-1
libpve-apiclient-perl: 3.0-3
libpve-common-perl: 6.1-2
libpve-guest-common-perl: 3.0-10
libpve-http-server-perl: 3.0-5
libpve-storage-perl: 6.1-7
libqb0: 1.0.5-1
libspice-server1: 0.14.2-4~pve6+1
lvm2: 2.03.02-pve4
lxc-pve: 4.0.2-1
lxcfs: 4.0.3-pve2
novnc-pve: 1.1.0-1
proxmox-mini-journalreader: 1.1-1
proxmox-widget-toolkit: 2.2-1
pve-cluster: 6.1-8
pve-container: 3.1-5
pve-docs: 6.2-4
pve-edk2-firmware: 2.20200229-1
pve-firewall: 4.1-2
pve-firmware: 3.1-1
pve-ha-manager: 3.0-9
pve-i18n: 2.1-2
pve-qemu-kvm: 5.0.0-2
pve-xtermjs: 4.3.0-1
qemu-server: 6.2-2
smartmontools: 7.1-pve2
spiceterm: 3.1-1
vncterm: 1.6-1
zfsutils-linux: 0.8.3-pve1
 
Today I tried the third node (pve3): I installed a VM on local-lvm, added it to the HA resource manager, and created an HA group which only contains pve3.

When I clicked "shutdown" on pve3, the log in the GUI showed "Stop all VMs and Containers" and then waited forever again. On the console, it also shows "A stop job is running for PVE Local HA Resource Manager Daemon (??min ??s / no limit)".
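In case someone wants to reproduce this, what I look at on the hanging node's console are plain systemd/journal commands (the unit names are the standard pve-ha ones):
Code:
# show which systemd job is still pending and blocking the shutdown
systemctl list-jobs
# follow the HA local/cluster resource manager logs for this boot
journalctl -b -u pve-ha-lrm -u pve-ha-crm -f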

It seems the latest build has a serious problem with the shutdown procedure when HA is enabled.
 
