Hi,
I have a small home lab cluster with 3 PVE nodes, each of which also serves as a Ceph node. Cluster sync and Ceph each have their own dedicated 10GbE network (and there is another dedicated 10GbE network to a PBS, which sits outside the cluster).
Now I want to upgrade the network to InfiniBand. For this, I need to replace the network cards in the nodes (and the switch and cables, of course). There are not enough slots to have both cards installed at the same time.
My question is: how can I replace the network without disrupting the cluster? Even if everything works on the first try (which I don't expect), there will be a period during which the nodes cannot see each other and could "panic"...
Is there a way to suspend cluster operations for the duration of the hardware replacement?
Or what is the best practice in my case?
Thanks!