I run a 3-node PVE cluster with CEPH.
I migrated all VMs away from node 3, upgraded to the latest CEPH (Quincy) and then started the PVE 7 to 8 upgrade on node 3.
After rebooting node 3 (now PVE 8), everything seemed to work well. So I migrated two VMs over to node 3, one from node 1 and one from node 2 (both still on PVE 7). Once the migration task was done, both VMs showed almost 100% CPU load: one immediately became unresponsive in the GUI console and the other hit a kernel panic. I fear this will happen with the other VMs once I migrate them.
In addition, when I click "Details" on one of the OSDs displayed in the PVE GUI under "Ceph" -> "OSD", I get a message such as "OSD '15' does not exist on host 'node3' (500)". This happens with all OSDs and from each node's GUI. However, the OSD list looks like it did before and the CEPH cluster is healthy.
I am wondering what went wrong here, whether it is related to the VMs crashing, and how I can fix it.
My plan was to upgrade PVE node by node, migrating the VMs to a node once its upgrade is complete, so that the VMs do not have any downtime.
Any help is greatly appreciated.