All nodes rebooted unexpectedly around the same time in a 3 node HA PVE 4.2 cluster

Attila

Active Member
Jun 15, 2016
Hi,

A few days ago we had a strange case: all three nodes restarted at nearly the same time, only a few seconds apart.

Around that time our provider had a DDoS attack. It did not affect our cluster, but some servers in the same VLAN were affected.

The DDoS started at around 3:40; our servers rebooted around 4:42. (A second reboot happened around 9:50, also as a consequence of a DDoS attack on a neighboring server. Our servers had no heavy traffic.)
My assumption is that the network might have been congested. Our servers are connected through a redundant port-channel connection.


To the best of our knowledge, our servers have no HW watchdog, so the only thing we suspect could be the cause is the soft watchdog.

I am attaching the logs from the three servers: 00, 01 and 02.

What I see is that node00 has had some pmxcfs issues since 2:10. This is the time when the backups are executed. Node02 is backed up to node00 between 2:00 and 2:10 (a small backup, approx. 30 GB in total, over a redundant 2x1 Gbit port-channel network).

We are trying to find the cause of the reboots. Can it be the soft watchdog, even if there are no log entries?

I see that the watchdog is being started after the reboot:

syslog.1:Sep 4 04:44:13 hostname00 watchdog-mux[2747]: Watchdog driver 'Software Watchdog', version 0
syslog.1:Sep 4 04:44:13 hostname00 kernel: [ 0.075567] NMI watchdog: enabled on all CPUs, permanently consumes one hw-PMU counter.
syslog.1:Sep 4 04:49:27 hostname00 pve-ha-lrm[3128]: watchdog active
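
For anyone who wants to repeat this search on the attached logs, here is a minimal sketch of how the relevant lines could be pulled out of each node's syslog. The keyword list and the function names are assumptions made for illustration, not anything provided by Proxmox. Note that finding nothing would not rule the watchdog out by itself, since a watchdog-triggered reset is abrupt and the last messages may never reach the disk.

```python
import re
from typing import Iterable, List, Sequence

# Assumed keyword list (hypothetical); extend it with whatever your nodes log.
KEYWORDS: Sequence[str] = ("watchdog", "softdog", "quorum", "corosync", "pmxcfs")


def filter_lines(lines: Iterable[str], keywords: Sequence[str] = KEYWORDS) -> List[str]:
    """Return the lines that mention any keyword, case-insensitively."""
    pattern = re.compile("|".join(re.escape(k) for k in keywords), re.IGNORECASE)
    return [line.rstrip("\n") for line in lines if pattern.search(line)]


def scan_files(paths: Iterable[str]) -> List[str]:
    """Scan several log files and prefix every hit with its file name."""
    hits: List[str] = []
    for path in paths:
        # errors="replace" keeps the scan going if a log contains odd bytes.
        with open(path, errors="replace") as fh:
            hits.extend(f"{path}:{line}" for line in filter_lines(fh))
    return hits
```

Calling `scan_files(["syslog.1"])` on each node and comparing the timestamps just before the reboot shows whether any node logged a watchdog or quorum message before going down.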



Any help would be appreciated.

Thanks!
 

Attachments

  • logs.zip
    49.8 KB
node1 and node2 do not have any errors (or warnings about lost quorum) in the logs, so IMHO the softdog cannot be the problem ...
 
