PVE 4 HA and redundant ring protocol (RRP)

alitvak69

Renowned Member
Oct 2, 2015
I am testing on a real cluster, so I decided to open a new thread to avoid confusion.

I have RRP (two different networks) configured in corosync. After testing HA in the case of a network failure, I now wonder if it makes sense at all.

When I stop one of the interfaces on a node, corosync declares the ring as faulty, which is correct.

I have a VM running on the same node with a single Ethernet interface that was bridged to the now-faulty ring, so I no longer have access to the VM.

As a result of this test, nothing happens with the VM even though it lost its connection to the world.

In my opinion, the VM should be fenced and started on another node, since it is no longer accessible on the current one.

Am I wrong?
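
For reference, the totem/nodelist part of my corosync.conf looks roughly like this (the network addresses below are placeholders, and only one node entry is shown):

Code:
totem {
  version: 2
  cluster_name: testcluster
  # passive mode: traffic fails over to the other ring if the active one breaks
  rrp_mode: passive
  interface {
    ringnumber: 0
    bindnetaddr: 10.10.10.0
  }
  interface {
    ringnumber: 1
    bindnetaddr: 192.168.20.0
  }
}

nodelist {
  node {
    name: virt2n1-la
    nodeid: 1
    quorum_votes: 1
    # one address per ring
    ring0_addr: virt2n1-la
    ring1_addr: virt2n1-la-int
  }
  # further node entries omitted
}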
 
Adamb

By the sounds of it, you are using your cluster network as your LAN too. Is that correct? As far as I know, fencing takes place at the host level, not the VM level.
 
Is support for RRP not implemented at all?

It seems so. I cannot migrate vm:101 via the second ring.

Issuing the migration to the alternative node name fails with (node virt2n1-la-int is not online).

The only method that worked for me was to log in to the test node, add a static route to the primary ring network via the secondary ring, and then issue the migrate command with the primary-ring hostname of the target node (virt2n1-la).
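
Roughly, the workaround looked like this (the network address and gateway below are placeholders):

Code:
# on the source node: make the primary ring network reachable again
# through a gateway that sits on the secondary ring (placeholder addresses)
ip route add 10.10.10.0/24 via 192.168.20.1

# then migrate using the target node's primary-ring hostname
qm migrate 101 virt2n1-la --online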
 
Adamb said:
By the sounds of it, you are using your cluster network as your LAN too. Is that correct? As far as I know, fencing takes place at the host level, not the VM level.

It sounds correct. However, when the physical network card on the node dies, it affects the VMs using it as well. I may be using a completely different network card for the cluster network, and in that case the VMs bridged to it will be affected too.

Wouldn't migrating / relocating the VMs be a good idea in this case?

I may be wrong, but pacemaker with corosync allows you to do that.

Also, the same issue would affect LXCs exposed to the outside world.
 
Adamb


Yeah, I don't think RRP is implemented. It is suggested to use a separate network for the cluster network. I'm not a fan of failovers based on network conditions on the LAN. There are just too many variables that could cause failovers.
 
alitvak69 said:
It sounds correct. However, when the physical network card on the node dies, it affects the VMs using it as well. I may be using a completely different network card for the cluster network, and in that case the VMs bridged to it will be affected too.

We always suggest making all components redundant, e.g. using a bond for the VM network.
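
For example, a minimal /etc/network/interfaces sketch with an active-backup bond under the VM bridge (interface names and addresses are just examples, adapt them to your setup):

Code:
# two physical NICs combined into a failover bond
auto bond0
iface bond0 inet manual
        bond-slaves eth0 eth1
        bond-mode active-backup
        bond-miimon 100

# the VM bridge sits on top of the bond instead of a single NIC
auto vmbr0
iface vmbr0 inet static
        address 192.168.1.10
        netmask 255.255.255.0
        gateway 192.168.1.1
        bridge-ports bond0
        bridge-stp off
        bridge-fd 0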

alitvak69 said:
I may be wrong, but pacemaker with corosync allows you to do that.

Yes, pacemaker is a different approach - more flexible, but really complex to configure.
 
I want to agree here, and I do agree with your observation. However, any interface may and will go down; e.g. a bond can go down, effectively cutting off the VMs. If there is a supported RRP feature, wouldn't it make sense to have a migration option?

As for the complexity, good documentation would help with learning it, as long as the solution works. I would rather take that than no solution.

Sent from my SM-G900V using Tapatalk
 
alitvak69 said:
However, any interface may and will go down; e.g. a bond can go down, effectively cutting off the VMs. If there is a supported RRP feature, wouldn't it make sense to have a migration option?

Please note that we added the RRP feature 2 days ago. So yes, it would make sense, but it is simply not implemented.
 
Is there a way to vote for migration over RRP? Also, do you have a plan for monitoring resources, i.e. VMs, in general? It would be a great feature to migrate or reboot a VM if it is not accessible by a monitor. It could be optional, but it would still be valuable.
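
Just to illustrate the kind of monitor I mean, even something as crude as the following, run periodically on the node, would already help (VM ID, IP, and target node are placeholders):

Code:
#!/bin/sh
# crude external VM monitor sketch (all values are placeholders)
VMID=101
VMIP=192.168.1.50
TARGET=virt2n2-la

# if the VM stops answering pings, try to move it to another node
if ! ping -c 3 -W 2 "$VMIP" > /dev/null 2>&1; then
    qm migrate "$VMID" "$TARGET" --online
fi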
 
Hi all!
I have nearly the same question as @alitvak69.
I tested RRP in our test environment with Ceph: two physically separate network switches, one for the cluster network (the nodes, 10.10.10.0) and one for the Ceph network (172.16.0.0).
I had the same experience as the others described: if the cluster network on one node is broken or offline, the VMs keep running (that's good), but they are not reachable at the moment. Also, the HA-configured VMs are not moved to another host. Is this possible to configure now? It would be very good if HA-configured VMs automatically moved to a node which is still reachable on the cluster network.

With the Ceph network: if one Ceph connection is broken or offline, the running VMs on that node stay available, but only in read-only mode. Is there maybe also a way for the HA-configured VMs to automatically move to another host?

Best regards,
Roman
 
