Balanced XOR Bonding problems with some VPS network connection

openaspace

Active Member
Sep 16, 2019
Italy
Hello.
This morning I was going crazy. All VPSs and CTs were working correctly, except one Ubuntu server VPS that kept losing its connection. After a lot of fiddling with the setup on the VPS, I discovered that the problem is in the Proxmox bonding.

All VPSs work except one!
Finally I disconnected one Ethernet cable from the bond and it works!

Where is the problem?
prx1-Proxmox-Virtual-Environment(1).png
 
One thing to remember about an XOR (balance-xor) or LACP bond is that each connection is hashed onto a single link, and depending on the hash policy the result often comes down to the low bits of the addresses, such as the last octet of the IP. Take a look at your VMs' IP addresses: does the one that didn't work differ in an obvious way? Is there another channel between the switch where the hypervisor/VMs are plugged in and the destination? If there is, that one can also be hashing differently.
In short, somehow that one VM was landing on a path that is not functioning properly. It does not necessarily need to be the immediate path within the hosting hypervisor. It could be a port, cable, or config issue anywhere along the entire network path.
Troubleshooting usually involves trying to isolate a specific port/path/channel using pings between IPs whose hash you can predict.
A good test, if you can afford it: change the IP of the "bad" VM (if it was odd, make it even, and vice versa), restore the channel, and see if that fixes the issue.
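
To make the "predict the hash" idea concrete, here is a rough Python sketch. It is only an illustration: it assumes an IP-only XOR hash with folding, loosely modelled on the layer2+3 policy described in the Linux bonding documentation, and it ignores the MAC and packet-type terms. Your bond's actual policy, and whatever the switch does on its side, may well differ. The function name `predicted_link` and the 192.168.1.x addresses are made up for the example.

```python
import ipaddress


def predicted_link(src_ip: str, dst_ip: str, link_count: int) -> int:
    """Guess which bond member a flow lands on under a simplified XOR hash.

    Assumption: IP-only hash (no MAC or packet type), folded and reduced
    modulo the number of links. Real kernels and switches may hash differently.
    """
    src = int(ipaddress.IPv4Address(src_ip))
    dst = int(ipaddress.IPv4Address(dst_ip))
    h = src ^ dst      # XOR of the two addresses
    h ^= h >> 16       # fold the upper bits into the lower ones
    h ^= h >> 8
    return h % link_count


if __name__ == "__main__":
    peer = "192.168.1.1"  # hypothetical gateway address
    for vm in ("192.168.1.50", "192.168.1.61"):  # hypothetical VM addresses
        print(vm, "-> link", predicted_link(vm, peer, 2))
```

If two addresses come out on different link indexes, pinging from each of them is a cheap way to exercise both members of the bond and see which one is dropping traffic.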

Good luck.


Blockbridge: Ultra low latency all-NVME shared storage for Proxmox - https://www.blockbridge.com/proxmox
 
Changing the IP from .50 to .61 worked.

Another question: I have upgraded my internet connection to 2.5 Gbps download. My server is an old Dell T310 with two 1 Gbps NICs in XOR, and I have now added a Realtek 2.5 Gbps NIC to the same bond. Is there a way to prioritize the 2.5 Gbps NIC and use the 1 Gbps NICs only when it is saturated?

For work I receive big raw 4K/8K video files from different clients at the same time, and I want to avoid one upload landing on a 1 Gbps NIC instead of the 2.5 Gbps one. :D