I have run into a strange situation with a new cluster I have built.
In order to update the BMC firmware I needed to cold reset each node in turn. I migrated all VMs from node A to node B, checked they were all running, confirmed that the services on the VMs were reachable as expected, and then powered down node A.
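For context, each migration was the equivalent of something like this on the CLI (the VMID and target node name here are illustrative):

    qm migrate 149 nodeB --online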
Immediately after powering down node A, all services running on the VMs became unavailable and all of their IP addresses became unreachable. The VMs still showed as running in the GUI, but the console could not be opened via vncproxy (e.g. VM 149 qmp command 'set_password' failed - unable to connect to VM 149 qmp socket - timeout after 51 retries).
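If it happens again I will try to capture some basic state from node B while node A is still down; a minimal set of checks, assuming a standard Proxmox install (VMID illustrative):

    # Is the KVM process still up and its QMP socket present?
    qm status 149 --verbose
    ls -l /var/run/qemu-server/149.qmp
    # Cluster / quorum state with node A offline
    pvecm status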
As soon as node A was powered back up, everything returned to normal...
I have multiple separate networks (e.g. corosync, migration, data via vmbr0, Ceph), each on its own switch.
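For reference, the layout is roughly along these lines; the interface names and addresses below are illustrative, not my actual config:

    # /etc/network/interfaces (simplified sketch)
    auto vmbr0
    iface vmbr0 inet static
        address 192.0.2.10/24
        bridge-ports eno1
        bridge-stp off
        bridge-fd 0

    # corosync, migration and Ceph each on their own NIC/switch, e.g.
    auto eno2
    iface eno2 inet static
        address 10.10.10.10/24   # corosync ring0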
Any help appreciated