During host restart, shared storage goes offline before VM migration completes

anwright

New Member
Jan 18, 2024
Hello,

I'm new to Proxmox so I'm not sure exactly how to go about troubleshooting this. I'm also using a custom storage plugin, which could well be the cause, but I don't know how to troubleshoot that either, because the documentation for storage plugins is... non-existent.

Environment:
  • 3 node cluster
  • PVE 8.1, freshly installed ~3 weeks ago, latest updates (non-subscription) installed just as I'm writing this, to ensure I'm not encountering an already-fixed bug
  • Shared storage using pve-moosefs, which I updated to add support for snapshots of raw files
  • HA settings: shutdown_policy=migrate
  • Migration settings: Default
  • CRS: Default
  • Several VMs running successfully on pve1
  • 1 VM running on pve2, with HA mode set to running
Expected behavior:
When I click the "reboot" button in the web UI for pve2, the running VM should be migrated to one of the remaining two nodes. The VM is using only shared storage, so it should be fairly quick. Once all HA VMs are migrated, shared storage should be taken offline, and the reboot should proceed.
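
For reference, the HA setting in my /etc/pve/datacenter.cfg looks like the excerpt below; migration and CRS entries are simply absent because they're at their defaults, as listed above.

Code:
# /etc/pve/datacenter.cfg (relevant excerpt)
ha: shutdown_policy=migrate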

Observed behavior:
Sometimes, but not always (roughly 50% of the time), the VM begins showing I/O errors on the console and the migration fails (error message below). Migration is retried repeatedly but never succeeds. pvesm sometimes shows the shared storage as inactive, even though there's still a VM using it. On a couple of attempts, however, both pvesm and the web UI showed the storage as enabled and active, but with the "usage" graph empty (indicating that the web UI can't get the size of the storage).

When this occurs, the only way to "fix" it is to set the VM's HA state to ignored, which allows the node to restart. My assumption is that shared storage is being taken offline before the VM migration completes, which is causing the failure. This seems like something PVE should understand, since the VM's disk lives on shared storage: it should wait to take the storage offline until the VM has been migrated. The behavior is also not consistent. Sometimes pvesm shows the storage as enabled and active, but in my most recent test (see below) pvesm says the storage is inactive. However, it's still mounted and I can read/write to it via the web shell.
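
In case it helps, this is roughly what I run from the web shell when it happens; the mount point path is just where my plugin mounts the storage, so adjust for your config. The last command is the workaround that lets the reboot finish.

Code:
# storage status as PVE sees it vs. what is actually mounted
pvesm status --storage mfs-main
findmnt -T /mnt/pve/mfs-main                 # mount point is specific to my setup
dd if=/dev/urandom of=/mnt/pve/mfs-main/iotest bs=1M count=10 oflag=direct && rm /mnt/pve/mfs-main/iotest

# recent storage/HA daemon activity
journalctl -u pvestatd -u pve-ha-lrm --since "10 min ago"

# workaround: drop the VM out of HA so the node can finish rebooting
ha-manager set vm:105 --state ignored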

Any suggestions?


Code:
VM Migration output:

task started by HA resource agent
2024-01-18 16:55:35 starting migration of VM 105 to node 'pve1' (192.168.11.11)
2024-01-18 16:55:35 starting VM 105 on remote node 'pve1'
2024-01-18 16:55:37 start remote tunnel
2024-01-18 16:55:38 ssh tunnel ver 1
2024-01-18 16:55:38 starting online/live migration on unix:/run/qemu-server/105.migrate
2024-01-18 16:55:38 set migration capabilities
2024-01-18 16:55:38 migration downtime limit: 100 ms
2024-01-18 16:55:38 migration cachesize: 512.0 MiB
2024-01-18 16:55:38 set migration parameters
2024-01-18 16:55:38 start migrate command to unix:/run/qemu-server/105.migrate
2024-01-18 16:55:39 migration active, transferred 307.5 MiB of 4.0 GiB VM-state, 2.7 GiB/s
2024-01-18 16:55:40 migration status error: failed
2024-01-18 16:55:40 ERROR: online migrate failure - aborting
2024-01-18 16:55:40 aborting phase 2 - cleanup resources
2024-01-18 16:55:40 migrate_cancel
2024-01-18 16:55:42 ERROR: migration finished with problems (duration 00:00:08)
TASK ERROR: migration problems

Code:
# pvesm status
mfsmaster accepted connection with parameters: read-write,restricted_ip ; root mapped to root:root
Name             Type     Status           Total            Used       Available        %
local             dir     active        63413948         4427048        55733244    6.98%
local-lvm     lvmthin     active       124596224               0       124596224    0.00%
mfs-main      moosefs   inactive               0               0               0    0.00%
zfs-1         zfspool     active       552730624            1236       552729388    0.00%

Web UI on the rebooting node:
screenshot.png
 
By executing Reboot from the UI, you are essentially calling /sbin/reboot:

https://github.com/proxmox/pve-manager/blob/master/PVE/API2/Nodes.pm#L608

Since "reboot" is a maintenance operation, you should make it a habit to evacuate the host in advance to get guaranteed results.
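
For example, one way to do that, using the node and VM IDs from your post (node maintenance mode is covered in the documentation linked further down):

Code:
# put the node into HA maintenance mode; its HA guests get migrated away
ha-manager crm-command node-maintenance enable pve2

# ...reboot, then take it out of maintenance again
ha-manager crm-command node-maintenance disable pve2

# or simply live-migrate the single guest yourself before rebooting
qm migrate 105 pve1 --online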

If you wanted to test how HA would work on a node loss, you should utilize "ipmitool", if available, to shut down the host. You may also get similar results with "/bin/systemctl kexec -f -f".
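
For instance (BMC address and credentials are placeholders):

Code:
# hard power-off through the BMC, bypassing the OS completely
ipmitool -I lanplus -H <bmc-address> -U <user> -P <password> chassis power off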

However, keep in mind that the HA event detection in PVE is quite long, if I am not mistaken. It's possible your host will come back up before HA decides to act.

There is also plenty of helpful information in the documentation: https://pve.proxmox.com/pve-docs/chapter-ha-manager.html#ha_manager_node_maintenance


 
Thanks for the info; it makes sense that "Reboot" through the web interface is really just the "reboot" command. I also get that evacuating a host ahead of a reboot is a good idea, and I'd probably do that in the general case. But I'd still expect that even during a reboot, storage would stay available until the last VM is shut down, frozen, or migrated. The reboot command doesn't just stop processes in random order; it's supposed to work in dependency order, so that e.g. storage stays mounted until it's no longer needed. The node does stay up indefinitely because the VM migration fails, which is what the documentation says will happen. But the VM sees I/O errors, so if I reboot a node with a running VM (perhaps by accident, because I didn't notice the VM was still running), I could end up with filesystem corruption in the worst case.
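
To see what actually pulls the storage down during shutdown, my plan is to watch it from another node while triggering the reboot, and to look at how the PVE services are ordered on the rebooting node; something like this (the unit list is my guess at the relevant services):

Code:
# from another node: watch how the storage status changes while pve2 reboots
watch -n1 'pvesm status --storage mfs-main'

# on pve2: check how the PVE services are ordered relative to each other
systemctl show pve-ha-lrm.service -p After -p Before
systemctl show pvestatd.service -p After -p Before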

What I'm trying to determine is:
  • Is this a bug in my storage plugin, where it's reporting offline when it shouldn't? (See the sketch after this list.)
  • Is this a bug in pve-manager or some other package?
  • Is this expected behavior for reasons that I don't understand (beyond just the fact that I'm calling reboot with running VMs)?
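
For the first question, my rough plan is to log what pvesm reports alongside what the kernel actually sees on the rebooting node, via the web shell (which keeps working since the node never actually goes down). The mount point and log paths are placeholders for my setup:

Code:
# run in the web shell on pve2 while the reboot/migration is in progress
while true; do
    date +%T
    pvesm status --storage mfs-main
    df -h /mnt/pve/mfs-main            # mount point depends on the storage config
    sleep 2
done | tee /tmp/mfs-status.log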