iSCSI Host shutdown issues

adamb

Hey all, running into a bit of an odd issue with iSCSI and Proxmox that I can't seem to pin down.

If a VM running on a host has high IO at the same time the host is rebooted, we run into some issues.

- Most of the time it will simply throw errors like below, then finally reboot
blk_update_request: I/O error, dm-6, sector 128194272
blk_update_request: I/O error, dm-6, sector 128522736
Buffer I/O error on dev dm-6 logical block 104857584, async page read

- In rare situations it will throw the above errors and keep looping, never actually rebooting. This has been very hard to reproduce, but it does happen occasionally.

It seems like a timing issue of sorts, almost as if the IO is still taking place when iSCSI is logged out. Or is networking stopping before open-iscsi? We only run CentOS 7 in our VMs, and acpid is installed and working properly.
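A quick way to sanity-check the ordering part of that theory is to look at the systemd ordering properties of the units involved (the unit names below are assumptions for a Debian-based PVE host and may differ per setup):

Code:
# Show the ordering dependencies systemd uses at start/stop
systemctl show -p After,Before open-iscsi.service
systemctl show -p After,Before networking.service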
 
Also forgot to mention: we're running the latest Proxmox.

root@testprox2:~# pveversion -v
proxmox-ve: 4.4-87 (running kernel: 4.4.59-1-pve)
pve-manager: 4.4-13 (running version: 4.4-13/7ea56165)
pve-kernel-4.4.35-2-pve: 4.4.35-79
pve-kernel-4.4.59-1-pve: 4.4.59-87
pve-kernel-4.4.19-1-pve: 4.4.19-66
pve-kernel-4.4.49-1-pve: 4.4.49-86
pve-kernel-4.4.40-1-pve: 4.4.40-82
lvm2: 2.02.116-pve3
corosync-pve: 2.4.2-2~pve4+1
libqb0: 1.0.1-1
pve-cluster: 4.0-49
qemu-server: 4.0-110
pve-firmware: 1.1-11
libpve-common-perl: 4.0-94
libpve-access-control: 4.0-23
libpve-storage-perl: 4.0-76
pve-libspice-server1: 0.12.8-2
vncterm: 1.3-2
pve-docs: 4.4-4
pve-qemu-kvm: 2.7.1-4
pve-container: 1.0-99
pve-firewall: 2.0-33
pve-ha-manager: 1.0-40
ksm-control-daemon: 1.2-1
glusterfs-client: 3.5.2-2+deb8u3
lxc-pve: 2.0.7-4
lxcfs: 2.0.6-pve1
criu: 1.6.0-1
novnc-pve: 0.5-9
smartmontools: 6.5+svn4324-1~pve80
zfsutils: 0.6.5.9-pve15~bpo80
 
Doing more testing, and it definitely seems that the VM is not given enough time to shut down before the host shuts down.

Watching iotop on the host, I can clearly see high IO taking place right up until the last second before the host reboots. I'm watching the host server's shutdown process over remote console on the iLO port.

2-3 seconds before the host finally reboots is when the "blk_update_request" and "Buffer I/O error" messages start scrolling, and at that point I still have an SSH connection to the VM and it's still cranking away.

How can I ensure the VM is allowed the time to shut down before the host does? Acpid is installed on the VM and works as expected when I issue "Shutdown" from the Proxmox GUI.
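For the record, the same graceful ACPI shutdown can be triggered from the CLI to rule out the GUI; the VMID 100 and the timeout below are just placeholders:

Code:
# Ask the guest to shut down via ACPI and allow an explicit grace period
qm shutdown 100 --timeout 180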
 
I think I just figured something out.

It looks like the "Bulk" action "Stop all VMs and Containers" is used during a host reboot/shutdown. When I manually test this option it does not work at all: the VM stays running and never stops. If I select the VM itself and use "Stop" or "Shutdown", the VM is gracefully shut down as expected, which tells me acpid is working in the VM.
 
Further testing has shown that the "Bulk" actions work as expected when the VM is NOT part of HA. If it is part of HA, the bulk shutdown does not gracefully shut down the VM.

Our issue and errors are resolved as long as the VM is properly shut down. Really stumped as to why the VM doesn't shut down when it's part of HA.
 
Maybe the reason is that Proxmox has introduced an async two-step HA shutdown, and the bulk shutdown does not wait for the stopped signal from that two-step shutdown?
 
Sounds like a great guess, and it makes sense, as the bulk stop returns within 1 second of being issued. Hoping the devs can provide some insight.

Surprised I am the only one who has noticed this issue. If this is truly the cause, then no VMs are gracefully shut down when a host is. That can't be good!
 
For one, I never use the bulk action, since I have made my own scripts, which are more featureful.
Second, I am convinced that the host shutdown uses its own bulk operation to shut down VMs, since the GUI bulk action does not take the configured shutdown order into consideration; it simply shuts down in numeric VMID order.
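For reference, a minimal sketch of what such an ordered-shutdown script could look like (the VMID list and timeout are placeholders, not the actual script):

Code:
#!/bin/sh
# Shut guests down in an explicit order instead of numeric VMID order,
# giving each guest up to --timeout seconds to stop cleanly
for vmid in 102 100 101; do
    qm shutdown "$vmid" --timeout 180
done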
 
There is an issue with the pve-ha-lrm service (which is, among other things, responsible for the orderly shutdown and freezing of HA-managed guests on hypervisor reboot/shutdown) being stopped too late in the shutdown cycle.

The fix is already committed in git, but not yet released:
https://git.proxmox.com/?p=pve-ha-manager.git;a=commit;h=2f549281f26a2a183d9d18ca0c91e286c0c0a3da
https://git.proxmox.com/?p=pve-manager.git;a=commit;h=a0a3cd4112d5e9e347bb77866b1d106b6f910341

A simple test of whether this is the cause of your issue would be to add the following file as "after-pveproxy.conf" in /etc/systemd/system/pve-ha-lrm.service.d/ (since you are not using Ceph, the other changes are not needed for testing):
Code:
[Unit]
# Order pve-ha-lrm after pveproxy so it is stopped before pveproxy at shutdown
After=pveproxy.service
Wants=pveproxy.service

followed by "systemctl daemon-reload"
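Put together, creating the drop-in and reloading would look roughly like this:

Code:
mkdir -p /etc/systemd/system/pve-ha-lrm.service.d
cat > /etc/systemd/system/pve-ha-lrm.service.d/after-pveproxy.conf <<'EOF'
[Unit]
After=pveproxy.service
Wants=pveproxy.service
EOF
systemctl daemon-reload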

Edit: you should remove this file after testing, at the latest when the above commits hit the system after an upgrade. Technically, the dependency on pveproxy is too strict, which is why we split the storage dependencies into their own unit in the referenced commits.
 
Here are more findings.

I have re-ordered the shutdown like so.

Stop Order
pve-ha-lrm
pveproxy
multipath
open-iscsi

The HA-managed VM is still running until the very last second the server is powered on.

I created a quick systemd service which runs a script at shutdown, before multipath-tools stops, that uses qm to suspend the VM. With this in place I have no issues on shutdown.
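In case it helps anyone, a rough sketch of that kind of workaround (the unit name, script path, and the exact multipath/iSCSI unit names are assumptions; the original script may differ):

Code:
# /etc/systemd/system/suspend-vms-before-storage.service  (name is an assumption)
[Unit]
Description=Suspend running VMs before multipath/iSCSI stop at shutdown
# After= means this unit's ExecStop runs before these units are stopped
After=multipath-tools.service open-iscsi.service
[Service]
Type=oneshot
RemainAfterExit=yes
ExecStart=/bin/true
ExecStop=/usr/local/bin/suspend-running-vms.sh
[Install]
WantedBy=multi-user.target

# /usr/local/bin/suspend-running-vms.sh  (enable the unit with "systemctl enable")
#!/bin/sh
# Suspend every locally running VM so its IO stops before storage goes away
for vmid in $(qm list | awk '$3 == "running" {print $1}'); do
    qm suspend "$vmid"
done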
 
Does anyone know if there is a way to test "freezing" a VM just like pve-ha-lrm does on an HA VM during shutdown? I can clearly see pve-ha-lrm setting the VM to freeze while the VM is still running.

UPDATE: Looks like I can achieve this by simply stopping pve-ha-lrm. It shows the VM in a "freeze" state, but the VM is definitely not frozen.

root@testprox1:~# ha-manager status
quorum OK
master testprox3 (active, Fri May 26 07:24:01 2017)
lrm testprox1 (old timestamp - dead?, Fri May 26 07:22:23 2017)
lrm testprox2 (active, Fri May 26 07:24:03 2017)
lrm testprox3 (idle, Fri May 26 07:24:04 2017)
service vm:100 (testprox1, freeze)
service vm:102 (testprox2, started)
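
For what it's worth, the "frozen" guest can be cross-checked from the node (VMID 100 is the HA-managed guest above):

Code:
# HA reports the service as "freeze", but the guest is still running
qm status 100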
 
Thank you for the report. This is clearly a bug in the LRM handling of reboots (instead of stopping the guests, or stopping and then freezing them, it just freezes them), and not related to any missing bits in the shutdown order. As a workaround, you can shut down and start nodes instead of rebooting. Updated packages with a fix will be released soon.