We recently started using Proxmox with shared storage and a few nodes.
These nodes have redundant multipath connections to the SAN. What happens when both paths go down? As I understand it, Proxmox does not support SCSI-3 PR (persistent reservations), although Linux does. How does Proxmox handle this? In a test scenario, when a host lost both of its connections, the virtual machines hung indefinitely and eventually died, and Proxmox did not reboot the host (I'm using Dell iDRAC hardware-based fencing).
How is storage-aware quorum handled (like the VMware datastore heartbeat)?
How is the automatic response to storage loss handled (like VMware VMCP)?
How is SCSI-3 PR handled (like Red Hat Enterprise Virtualization does using Pacemaker with STONITH Block Device)?
Is any of this handled at all? It doesn't seem so from my testing.
For now I've worked around this with a script that a systemd service and systemd timer run every 30 seconds. It checks all paths; if all paths are down it increments a counter, and when the counter reaches 3 it runs echo c > /proc/sysrq-trigger, which causes a kernel panic so that the node hard-reboots via the IPMI watchdog.
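A minimal sketch of such a check, for illustration only; it is not the exact script in use. It assumes that multipath maps can be recognised by a device-mapper UUID starting with "mpath-", that a healthy path reports "running" in its sysfs device/state file, and that a host with no multipath maps at all should never panic. The state file /run/apd-watchdog.count and all function names are hypothetical. The sysfs state may lag behind what the multipath path checker sees, so treat this as a starting point.

```python
#!/usr/bin/env python3
"""Sketch of an all-paths-down check, meant to be run once per timer tick."""
from pathlib import Path

THRESHOLD = 3  # consecutive failed checks before panicking
COUNTER_FILE = Path("/run/apd-watchdog.count")  # hypothetical state file


def path_states(sysfs_root=Path("/sys/block")):
    """Return {device: state} for the slaves of every multipath map."""
    states = {}
    for dm in sysfs_root.glob("dm-*"):
        try:
            uuid = (dm / "dm" / "uuid").read_text().strip()
        except OSError:
            continue
        if not uuid.startswith("mpath-"):  # assumption: skip LVM and other maps
            continue
        slaves = dm / "slaves"
        if not slaves.is_dir():
            continue
        for slave in slaves.iterdir():
            state_file = sysfs_root / slave.name / "device" / "state"
            try:
                states[slave.name] = state_file.read_text().strip()
            except OSError:
                states[slave.name] = "missing"
    return states


def all_paths_down(states):
    """True only if at least one path is known and none is running."""
    return bool(states) and all(s != "running" for s in states.values())


def next_counter(previous, down):
    """Count consecutive failed checks; any healthy check resets to zero."""
    return previous + 1 if down else 0


def read_counter(counter_file):
    """Read the persisted counter, treating a missing or bad file as zero."""
    try:
        return int(counter_file.read_text())
    except (OSError, ValueError):
        return 0


def main():
    down = all_paths_down(path_states())
    count = next_counter(read_counter(COUNTER_FILE), down)
    COUNTER_FILE.write_text(str(count))
    if count >= THRESHOLD:
        # Same effect as: echo c > /proc/sysrq-trigger (crashes the kernel).
        Path("/proc/sysrq-trigger").write_text("c")


if __name__ == "__main__":
    main()
```

The counter is kept in a file because each timer run is a separate process; placing it under /run means it is cleared on reboot, so a freshly rebooted node starts counting from zero.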
I'd like to know whether there's a better or officially supported way to configure this, as I couldn't find anything concrete while searching online.