Plugged proxmox server into switch - breaks other heartbeat servers

squeeb

Jan 31, 2011
Hi Guys,

I have a very strange issue here that hopefully some of you know the answer to.

I have 2 x Proxmox 2.1 servers.
They have two interfaces each: eth0 is bridged to vmbr0, and eth1 is just a point-to-point link to the other Proxmox server via crossover cable.

Server 1:
Hostname vz1
eth0: raw interface, no IP
vmbr0: 192.168.1.1/24
eth1: 10.99.99.1/30

Server 2:
Hostname vz2
eth0: raw interface, no IP
vmbr0: 192.168.1.2/24
eth1: 10.99.99.2/30

They work fine on their own switch, but I moved them into our datacenter alongside a couple of servers running Heartbeat v1 that provide high-availability NFS.

Fileserver 1:
Hostname: NFS1
eth0: 192.168.1.10/24
eth0:1 192.168.1.20/24 (floating IP, heartbeat configured)
eth1: 10.99.99.1/30

Fileserver 2:
Hostname: NFS2
eth0: 192.168.1.11/24
eth1: 10.99.99.2/30

The two file servers are in multicast group 239.0.0.1 on eth0
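
Since both server pairs reuse 10.99.99.0/30 on their crossover links, here is a minimal sketch of an address inventory check. It only confirms that no IP is used twice within any one link. The link labels ("lan", "vz-crossover", "nfs-crossover") are names I made up for illustration, and the sketch assumes the crossover links really are isolated from the switch.

```python
import ipaddress
from collections import defaultdict

# Addresses copied from the post. The fourth field is a made-up label
# for the physical link each interface sits on (an assumption).
ADDRESSES = [
    ("vz1",  "vmbr0",  "192.168.1.1/24",  "lan"),
    ("vz2",  "vmbr0",  "192.168.1.2/24",  "lan"),
    ("NFS1", "eth0",   "192.168.1.10/24", "lan"),
    ("NFS1", "eth0:1", "192.168.1.20/24", "lan"),
    ("NFS2", "eth0",   "192.168.1.11/24", "lan"),
    ("vz1",  "eth1",   "10.99.99.1/30",   "vz-crossover"),
    ("vz2",  "eth1",   "10.99.99.2/30",   "vz-crossover"),
    ("NFS1", "eth1",   "10.99.99.1/30",   "nfs-crossover"),
    ("NFS2", "eth1",   "10.99.99.2/30",   "nfs-crossover"),
]


def find_duplicates(entries):
    """Return {(link, ip): [owners]} for every IP used more than once on a link."""
    seen = defaultdict(list)
    for host, iface, cidr, link in entries:
        ip = ipaddress.ip_interface(cidr).ip
        seen[(link, ip)].append(f"{host}/{iface}")
    return {key: owners for key, owners in seen.items() if len(owners) > 1}


if __name__ == "__main__":
    dupes = find_duplicates(ADDRESSES)
    print(dupes or "no duplicate addresses within any one link")
```

If an eth1 port ever ended up on the shared switch, the same check with every entry relabelled to one link would flag the two reused 10.99.99.x addresses.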

As soon as I plug either Proxmox box into the same network as the file servers, Heartbeat on the file servers says it lost connectivity to the other file server and tries to take over the resources (both of them do this at the same time and all manner of hell occurs).

As soon as I physically remove the Proxmox servers from the network, Heartbeat on the file servers resumes normally and everything is fine.

Here are the steps I have tried so far:

* Reconfigured proxmox's cluster multicast IP to 239.0.0.2 on eth1 so it's completely separate from heartbeat on the file servers
* Stopped cman on both proxmox servers
* Plugged one proxmox server in at a time
* Removed the IP addresses from both Proxmox servers' vmbr0 interfaces and plugged them in one at a time.

With each of the above steps, Heartbeat on the file servers decided it had lost contact with its partner and tried to assume the resources.

I have run out of ideas here.

I have a hunch it's something to do with the bridge interface, perhaps interfering with multicast traffic.
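
One way to look at the bridge side of this hunch is to dump the settings the Linux bridge driver exposes under sysfs. This is only a sketch: it assumes the bridge is named vmbr0 as above and that the running kernel provides these files (older kernels may not have the multicast_* entries, in which case they show as not available). It reads values only and does not prove or rule out anything by itself.

```python
from pathlib import Path

# Files the Linux bridge driver exposes under /sys/class/net/<bridge>/bridge/
# on kernels that support them; the multicast_* ones may be missing on older kernels.
SETTINGS = ("stp_state", "forward_delay", "multicast_snooping", "multicast_querier")


def read_bridge_settings(bridge="vmbr0", sysfs_root="/sys/class/net"):
    """Return {setting: value} for one bridge; a value of None means the file is missing."""
    base = Path(sysfs_root) / bridge / "bridge"
    result = {}
    for name in SETTINGS:
        try:
            result[name] = (base / name).read_text().strip()
        except OSError:
            result[name] = None
    return result


if __name__ == "__main__":
    for name, value in read_bridge_settings().items():
        print(f"{name}: {value if value is not None else 'not available'}")
```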

Initially the switch they were plugged into (A Cisco 3560) was configured so that their ports were vlan trunks with their native vlan set to the 192.168.1.0/24 subnet.
I set the ports to access ports instead to no avail.
I also disabled spanning tree on those ports to see if that was the issue, still to no avail.

Can somebody shed some light on this issue? I would be very grateful :)

Regards,
Squeeb
 
Interestingly, both of the file servers can ping each other constantly while the failure is occurring. This suggests the problem is with multicast traffic on the subnet being modified or altered by the Proxmox boxes when they are plugged into the network. I'm sure it's something to do with the bridged interface, but I have no idea how to test this.
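
A rough way to test multicast delivery separately from Heartbeat is a small sender/listener pair. The sketch below joins the 239.0.0.1 group from the post but uses port 50694, which is a made-up test port chosen so it does not clash with Heartbeat's own traffic. The script name and the interface addresses passed on the command line are assumptions too. Run the listener on one file server and the sender on the other, then plug the Proxmox box in and see whether the messages stop arriving.

```python
import socket
import struct
import sys

GROUP = "239.0.0.1"  # multicast group from the post
PORT = 50694         # hypothetical test port, not Heartbeat's own port


def membership_request(group, iface_ip="0.0.0.0"):
    """Build the 8-byte ip_mreq structure used by IP_ADD_MEMBERSHIP."""
    return socket.inet_aton(group) + socket.inet_aton(iface_ip)


def listen(group=GROUP, port=PORT, iface_ip="0.0.0.0"):
    """Join the group and print every datagram received (runs until interrupted)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", port))
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP,
                    membership_request(group, iface_ip))
    while True:
        data, sender = sock.recvfrom(1024)
        print(f"{sender[0]}: {data.decode(errors='replace')}")


def send(message, group=GROUP, port=PORT, iface_ip="0.0.0.0", ttl=1):
    """Send one datagram to the group; TTL 1 keeps it on the local subnet."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_TTL, struct.pack("b", ttl))
    if iface_ip != "0.0.0.0":
        # Pin the outgoing interface by its address, e.g. eth0's IP.
        sock.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_IF,
                        socket.inet_aton(iface_ip))
    sock.sendto(message.encode(), (group, port))
    sock.close()


if __name__ == "__main__":
    mode = sys.argv[1] if len(sys.argv) > 1 else ""
    iface = sys.argv[2] if len(sys.argv) > 2 else "0.0.0.0"
    if mode == "listen":
        listen(iface_ip=iface)
    elif mode == "send":
        send("mcast test", iface_ip=iface)
    else:
        print("usage: mcast_probe.py listen|send [local-interface-ip]")
```

For example, `python mcast_probe.py listen 192.168.1.10` on NFS1 and `python mcast_probe.py send 192.168.1.11` on NFS2, repeating the send every second or so while the Proxmox port is brought up.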
 
Ok, some more progress.

I set up a couple of old Dell 1850s in a DRBD / Heartbeat v1 / NFS configuration just like the one we run at the datacenter, and they worked fine.

I then booted up a single Proxmox 2.1 server, attached it to the same switch on the same VLAN / subnet, and watched the HA log files on the DRBD servers.

As soon as the interface came up on the Proxmox server, both HA servers flipped out, each thinking that its partner had disconnected.


INTERESTINGLY though, after a couple of minutes they returned to an active/standby mode, as they should under normal circumstances.

I'm sure it's something to do with the bridge configuration.. nnnngggg!!!
 
Do you use identical IP multicast addresses somewhere?
 
Not on this subnet, no. The HA file servers are using 239.0.0.1 on eth0.

I configured /etc/pve/cluster.conf to use 239.0.0.2.

I verified that the new multicast address was being used by restarting and then running netstat -lpnu, and sure enough, corosync was listening on 239.0.0.2.

However, this didn't help, and the problem occurs even when corosync isn't running and no multicast address is present in netstat.
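
As a sanity check on the two groups, different multicast IPs can still share one Ethernet address, because only the low 23 bits of the group are copied into the 01:00:5e MAC prefix. The small sketch below works through that mapping. The function name is mine, and 239.128.0.1 is just an illustrative group that is not used anywhere in this setup.

```python
import ipaddress


def multicast_mac(group):
    """Map an IPv4 multicast group to its Ethernet multicast MAC address."""
    addr = ipaddress.IPv4Address(group)
    if not addr.is_multicast:
        raise ValueError(f"{group} is not an IPv4 multicast address")
    low23 = int(addr) & 0x7FFFFF  # only the low 23 bits survive the mapping
    octets = [0x01, 0x00, 0x5E,
              (low23 >> 16) & 0x7F, (low23 >> 8) & 0xFF, low23 & 0xFF]
    return ":".join(f"{o:02x}" for o in octets)


if __name__ == "__main__":
    for group in ("239.0.0.1", "239.0.0.2", "239.128.0.1"):
        print(group, "->", multicast_mac(group))
```

239.0.0.1 and 239.0.0.2 map to different MACs (ending in :01 and :02), so the two groups do not collide at layer 2, whereas something like 239.128.0.1 would share a MAC with 239.0.0.1.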
 
