Proxmox network crashes during heavy network usage

Robert0

Apr 1, 2021
We have a Proxmox set-up of three nodes sharing some 15 VMs. The nodes and VMs are behind an actual router.

This has been happily running for a couple of years now without much trouble, until recently. I wanted to update a box from Debian 9 to 10, and during the `apt upgrade` (which would download about 1 GB) I lost my remote connection. All the other VMs and nodes also became unreachable. After about 30 minutes the whole system came back online.

I tried the upgrade a day later with the exact same result. Then a few days later I wanted to download a backup of a VM to do some local tests, which was also quickly interrupted by a lost connection, again making all the VMs and nodes unreachable.

I've attached sections of the syslog for the second and third crash (from 22-03, ~19:04 and 30-03, ~18:43). In both cases the systems had been running fine for days.

The first actual error appears to be:

Mar 30 18:42:56 bismuth corosync[1318]: error [TOTEM ] FAILED TO RECEIVE
Mar 30 18:42:56 bismuth corosync[1318]: [TOTEM ] FAILED TO RECEIVE

I'm at a loss as to what's happening here. I know how to use Proxmox, but I know little of the underlying mechanics.
Maybe this thread is related: https://forum.proxmox.com/threads/new-cluster-totem-failed-to-receive-after-4mins.58935/. That error is in my logs too.
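
In case it helps anyone compare against their own logs, here is a minimal sketch of how the corosync lines quoted above could be pulled out of a syslog. The function name `find_totem_failures` and the regular expression are my own illustration, not anything Proxmox or corosync ships. The pattern assumes the classic syslog layout shown in the excerpt (month, day, time, host, `corosync[pid]:`), and other log formats would need a different one.

```python
import re

# Matches syslog lines such as:
#   Mar 30 18:42:56 bismuth corosync[1318]: [TOTEM ] FAILED TO RECEIVE
# Assumption: classic syslog timestamp layout, as in the excerpt above.
TOTEM_FAILURE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d{1,2} \d{2}:\d{2}:\d{2}) (?P<host>\S+) "
    r"corosync\[\d+\]:.*\[TOTEM\s*\] FAILED TO RECEIVE"
)


def find_totem_failures(lines):
    """Return distinct (timestamp, host) pairs for TOTEM receive failures.

    corosync logs the message twice (once with an "error" prefix), so
    entries with the same timestamp and host are reported only once.
    """
    seen = []
    for line in lines:
        match = TOTEM_FAILURE.match(line)
        if match:
            entry = (match.group("stamp"), match.group("host"))
            if entry not in seen:
                seen.append(entry)
    return seen


# The two lines from the excerpt above, used as sample input.
sample = [
    "Mar 30 18:42:56 bismuth corosync[1318]: error [TOTEM ] FAILED TO RECEIVE",
    "Mar 30 18:42:56 bismuth corosync[1318]: [TOTEM ] FAILED TO RECEIVE",
]
print(find_totem_failures(sample))
```

To run it against a real log, pass an open file instead of the sample list, for example `find_totem_failures(open("/var/log/syslog"))`. That path is the usual Debian default, but it may differ on your system.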

We're using Proxmox VE 5.2-1. (I know, embarrassingly old. With limited physical access to servers we're having a hard time upgrading.)

I would much appreciate any thoughts on what the problem could be!
 
Thank you for your reply!
What does it mean to put Corosync on a separate physical network? Should I install a second network card in each of my nodes and connect those secondary Ethernet ports together with a switch or something?

Note that the nodes are already pretty exclusive to their own physical network now.
 
We have updated to Proxmox 6.2 (using a clean update). We also replaced the SSD of one of the nodes because it was reporting errors.
We hoped this would be enough, but the crashes during downloads still persist...

We currently lack the hardware to set up such a dedicated network.

But it's still strange: we have used this Proxmox network for over a year without any such problems.

I'll make sure to post once I know more.
 
