All three of my Proxmox servers restart unexpectedly.

Note: The other day I realized that the fiber optic switch restarted for no apparent reason.
That would mean a downtime of the link equal to the boot time of the switch. Could be long enough to trigger corosync timeouts.
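For reference, the timeout in question is corosync's totem token; on a stock Proxmox install it works out to only a second or two on a 3-node cluster, so any switch reboot will exceed it. A rough sketch of where the knob lives (the value is purely illustrative, and raising it does not fix the underlying topology):

# /etc/pve/corosync.conf - totem section, sketch only
totem {
  # token: how many milliseconds corosync tolerates a silent link before
  # it declares the membership lost; the default is on the order of
  # 1-2 seconds on a 3-node cluster, far less than a switch boot time
  token: 10000
}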

Overheating? One of the companies where I worked had switches with no sensors for fan rotation or temperature; they only found out a fan was clogged (not on fire...) when network performance dropped because one of the switches reached a critical temperature.
 
Hi.

I have a server in the central building of the company. It has two network cables that connect to Gigabit ports on a Cisco switch, and from there a fiber optic cable runs to a Cisco fiber optic switch.
The other two servers are in two different buildings, and their connections are set up identically to the first server's.

Note: The other day I realized that the fiber optic switch restarted for no apparent reason.
Hi José,

I just want to confirm my understanding; correct me on any point where I am wrong:
1) Three buildings, each holding exactly one node (server); in your logs they were marked srv1, srv2, srv3;
2) in each of those 3 cases, your network setup is identical, meaning:
3) from a node there are 2 metallic GbE connections to the same switch "in the cabinet" for that particular node - these are not a link aggregate, they are on completely different subnets and routed separately later on;
4) from the switch "in the cabinet", you have an OS2 (?) fibre to another switch "for the inter-building connections";
5) this second switch "for the inter-building connections" is located at the central building, and the other two buildings also connect to it via OS2 (?) fibre;
6) the inter-building connections are your own fibre cable; at no point does the traffic route through the public internet, not even via a VPN.

Correct me in the above where I got it wrong - I am sure I would not have guessed it all correctly. :)

Also, can you post ping times between srv[1-3]? Meaning from srv1 to srv2, from srv1 to srv3, BUT ALSO from srv2 to srv3?

Also, as you have two interfaces on each node, can you run the pings on both subnets?

Finally, if you can afford to do that, can you keep one of the pings running (e.g. srv1 to srv2) while you manually reboot the switch "for the inter-building connections" and post it as well?
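Something along these lines would do, run from each node in turn (the 10.0.x.x / 10.1.x.x addresses are just placeholders for your two subnets):

# from srv1: latency to the other nodes on the internal/cluster subnet
ping -c 20 10.0.0.2        # srv2 (placeholder address)
ping -c 20 10.0.0.3        # srv3

# the same on the administrative subnet
ping -c 20 10.1.0.2        # srv2
ping -c 20 10.1.0.3        # srv3

# long-running ping with timestamps while the core switch is rebooted,
# so the gap in the sequence numbers shows exactly how long the link was down
ping -D 10.0.0.2 | tee ping-srv1-srv2.log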
 
That would mean a downtime of the link equal to the boot time of the switch. Could be long enough to trigger corosync timeouts.
This might be the wrong problem to solve as far as cluster health goes. If HA is the goal and everything routes through a single point of failure located in the central building anyway, the cluster would have been better off with all three nodes in the same rack, with off-site backups for disaster recovery. Having that one central switch at the middle of a star topology means there is effectively no HA, and the reboots are just symptoms of (not only) a failing switch, but also a topology that is suboptimal for HA.
 
Hi Esyy.

In relation to point 3, we have 2 network connections: one for internal communication between the 3 servers, and the second for administrative purposes.
All fiber optic cables reach a single fiber optic switch in the data center.
Points 4, 5, and 6 are correct.

Regarding the pings on the internal network between the servers (srv1-srv2-srv3), the maximum is 6 ms. The pings from the administrative network to the three servers show a maximum of 3 ms.
 
Hi José

All fiber optic cables reach a single fiber optic switch in the data center.

So this is what I suspected. Basically, all three nodes rely completely on that one core switch to function. Your high-availability setup protects against e.g. a single fibre cable cut, but if the core switch has any issue, like in this case, your cluster falls apart - this is by design; the three nodes lose connection to each other and each basically tries to fence itself off.

Each node on its own knows nothing more than that its connection to the other two was severed and that it might itself be the problem, so it reboots rather than keep providing services it is no longer allowed to provide. This is behaviour you actually want from a node: it helps in cases where the problem lies with that one particular node or its uplink, but not with the rest, and it keeps the cluster healthy.
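You can usually confirm that this is what happened from the quorum state and the logs, e.g.:

# current membership / quorum view from one of the nodes
pvecm status

# per-link status of corosync (knet) from this node's point of view
corosync-cfgtool -s

# look for lost quorum, watchdog and fencing messages around the reboots
journalctl -u corosync -u pve-ha-lrm -u pve-ha-crm --since "2 days ago" | grep -iE "quorum|watchdog|fence"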

We could of course go and tweak the setup now so that it tolerates your core switch rebooting, or you could go troubleshoot the core switch. But in relation to the cluster, I think this helped you discover that your topology for operating a cluster - 3 nodes at the tips of a star with the core switch at the centre - does not provide high availability at all.

When you think about it, it is worse than having high availability turned off, because in that case your nodes would not be rebooting; the services would just momentarily be available only in the particular building the respective node runs in.

Besides troubleshooting the switch (which, even if fixed, does not change your topology), you could of course make the network more redundant (but then more expensive), or the cheapest solution would be to put 2 of the 3 nodes together in the same building - I would choose the building that needs the services the most (not necessarily the data centre). That way, if you had any issue with the core switch, the third node would fence itself off, but the two nodes running together would keep providing working services from the cluster (to anyone on the network who can still connect).

I would also add that it is possible to make the interconnection between two such collocated nodes redundant at zero cost, because you can basically run an extra connection directly between the two of them.
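Just to illustrate what I mean (the NIC name and the 10.10.10.0/30 subnet are made up), the direct cable is nothing more than a tiny point-to-point network configured on both nodes in /etc/network/interfaces:

# on the first of the two collocated nodes, using a spare NIC for the
# back-to-back cable
auto enp3s0
iface enp3s0 inet static
    address 10.10.10.1/30

# on the second node
auto enp3s0
iface enp3s0 inet static
    address 10.10.10.2/30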

Another option is to run all three nodes in the same building, but that is not ideal from the viewpoint of having no off-site replicas at all. In a perfect world you would have 1 or 2 more core switches, but that is more a topic for a networking forum than for clustering Proxmox.

Regarding the pings on the internal network between servers (srv1-srv2-srv3), the maximum is 6 ms. The pings from the administrative network to the three servers saw a maximum of 3 ms.

That's all good then, of course only when it works. :)
 
Hi José

I am running corosync 3.1.7-pve3 on Proxmox 8.0.4, so there is a newer version. For the 3.1.5 that you have on your system, there was a bug reported (and since fixed) with very similar symptoms to what you describe. I noted your Proxmox VE 7 version; I do not know whether a simple apt update && apt upgrade can get you a more recent corosync or whether Proxmox 8 is needed for that, but if you update and are still on corosync 3.1.5, you may want to consider whether it is worth upgrading to Proxmox 8. It may (but also may not) resolve your issue.
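To see what you actually have and whether your configured repositories already carry a newer build, something like:

# corosync version as packaged with Proxmox
pveversion -v | grep corosync

# refresh the package lists and see whether corosync is upgradable
apt update
apt list --upgradable | grep corosync

# upgrade within the current major release
apt full-upgrade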
 
I would also add that it is possible to make the interconnection between two such collocated nodes redundant at zero cost, because you can basically run an extra connection directly between the two of them.

Actually - you'll probably find that it's common to have multiple nodes plugged into one switch and relying on it. For HA, of course, this is no good, because loss of that switch means all of your nodes start rebooting. For me, this is where the Proxmox documentation could be better - it makes it very easy to set the system up this way, but not easy at all to figure out how you might set up your nodes with direct links between them to avoid a fencing match.
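As far as I can tell from the docs and corosync.conf(5), the mechanism itself is just additional ringX_addr entries plus a matching interface block in /etc/pve/corosync.conf, roughly like the sketch below (addresses are placeholders, and config_version must be bumped when editing). The catch is that every node needs an address on every link, so a direct cable between only two of three nodes cannot simply be added as an extra corosync link on its own.

# /etc/pve/corosync.conf - rough sketch of a cluster with two links
nodelist {
  node {
    name: srv1
    nodeid: 1
    quorum_votes: 1
    ring0_addr: 10.0.0.1    # existing cluster subnet (placeholder)
    ring1_addr: 10.1.0.1    # second subnet reaching all nodes (placeholder)
  }
  node {
    name: srv2
    nodeid: 2
    quorum_votes: 1
    ring0_addr: 10.0.0.2
    ring1_addr: 10.1.0.2
  }
  node {
    name: srv3
    nodeid: 3
    quorum_votes: 1
    ring0_addr: 10.0.0.3
    ring1_addr: 10.1.0.3
  }
}

totem {
  interface {
    linknumber: 0
  }
  interface {
    linknumber: 1
  }
}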

I have a couple of systems exactly like that: two co-located and a third elsewhere in the building, mainly in case of catastrophes like fire, flood, or both caused by a forklift truck through the wall into the server cabinet (seriously).

Servers often have 4 or more network ports, so it would be easy to connect 2 or 3 of them together directly. I would like to know how to set this up. I also have a replication network which could be used to communicate host health.
 
