Multiple Clusters destroyed at the same time

MasterTH

Hi,

Yesterday I ran into a really strange issue. I have three different clusters running, and yesterday at 20:04 all three of them became corrupted. I asked the datacenter staff, but nothing had happened on their side (no network reboots or anything like that). The only thing they could see was a DDoS attack against the IP address of one virtual machine.
The versions differ: two of the clusters run 2.2 and the third runs 2.3.

Could a DDoS cause such problems?

Kind regards
MasterTH
 

What do you mean by destroyed/corrupted?
 
Maybe your network has issues with IP multicast? For example, your switches may have dropped multicast traffic (check the switch logs).
 
It is possible that this was a multicast problem. I had a similar issue with multicast because of IGMP snooping on Linux bridges, which blocked all multicast traffic.
One host sent junk multicast traffic and affected all my different clusters on the same VLAN; every Linux bridge was blocking multicast traffic.

I resolved it with echo 0 > /sys/devices/virtual/net/vmbrX/bridge/multicast_snooping, or by putting the Proxmox IP on the physical interface instead of the bridge.
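
For anyone who wants to check several bridges at once, here is a minimal sketch built around the sysfs file mentioned above. The function names show_snooping and disable_snooping and the optional root-directory argument are my own assumptions for illustration (the argument only exists so the functions can be tried against a test directory); only the multicast_snooping path comes from the post.

```shell
#!/bin/sh
# Sketch only: helper names and the optional root argument are hypothetical.
# The default root is the sysfs path quoted in the post above.

# Print the current multicast_snooping value for every bridge found.
show_snooping() {
    net_root="${1:-/sys/devices/virtual/net}"
    for f in "$net_root"/*/bridge/multicast_snooping; do
        # Skip when the glob matched nothing or the file is unreadable.
        [ -r "$f" ] || continue
        bridge=${f#"$net_root"/}   # strip the leading root directory
        bridge=${bridge%%/*}       # keep only the interface name
        printf '%s: multicast_snooping=%s\n' "$bridge" "$(cat "$f")"
    done
}

# Write 0 to the snooping file of one bridge (needs root on a real host).
# $1 = bridge name, $2 = optional root directory
disable_snooping() {
    bridge="$1"
    net_root="${2:-/sys/devices/virtual/net}"
    echo 0 > "$net_root/$bridge/bridge/multicast_snooping"
}
```

On a real node you would run show_snooping first, then something like disable_snooping vmbr0 as root. A value written to sysfs this way does not survive a reboot, so it has to be reapplied at boot if it turns out to help.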
 
Hi,

Same problem here: 2 clusters running PVE v2.2. One production cluster with 5 nodes and one test cluster with 3 nodes.

I upgraded the test cluster (apt-get update ; apt-get dist-upgrade) to the latest version. Everything seemed OK, but cman was out of order: it could be stopped but not restarted (quorum lost, etc.).

A reboot of the test cluster solved the problem.

BUT the production cluster is also out of sync: cman there was affected by multicast traffic from the test cluster, WITHOUT any apt-get command being run on it.

All VMs are OK, but there is no cluster stack (no migration, no vzdump, etc.).

A reboot cycle needs to be planned and the VMs need to be stopped: not good...

Probably an effect of multicast propagation, but I'm not an expert.

Any advice?

Thanks,

Christophe.
 
The DDoS lasted about a minute, then the datacenter blocked the traffic.

That is enough to break corosync cluster communication. You should use a separate network for cluster communication if you expect DoS attacks on the network.
 
