I'm running the latest PVE 6.x on a 2-host cluster (the physical servers aren't identical, but they are very similar):
Code:
pve-manager/6.4-13/9f411e79 (running kernel: 5.4.128-1-pve)
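If the full package list matters, I can post the verbose version info as well; I'd grab it with:
Code:
# full list of PVE-related package versions
pveversion -v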
And I've had weird issues with networking from the LXC containers: packets get lost whenever there is a constant, fast network flow going on. A casual ping works fine without any packet loss, but this command shows the issue when run from a container:
Code:
# ping -f -s 972 -M do -i 0.00191 -c 1000 -Q 5 172.16.x.x
PING 172.16.x.x (172.16.x.x) 972(1000) bytes of data.
..........................................................
--- 172.16.x.x ping statistics ---
1000 packets transmitted, 942 received, 5% packet loss, time 1525ms
rtt min/avg/max/mdev = 0.123/0.208/0.279/0.025 ms, ipg/ewma 1.527/0.203 ms
Sometimes it's less than 5%, but it's still unacceptable. I just did a clean creation of a new CentOS 7 container from the templates and I get the same issue.
The host node has an IP in the same subnet, on the same vmbr0 device, and the same ping command runs there without any packet loss:
Code:
--- 172.16.x.x ping statistics ---
1000 packets transmitted, 1000 received, 0% packet loss, time 999ms
rtt min/avg/max/mdev = 0.119/0.207/0.290/0.020 ms, ipg/ewma 0.999/0.203 ms
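If raw counters would help, I can also pull the per-interface statistics on both sides; something like this (eth0 is just the assumed default name of the container's interface, adjust as needed):
Code:
# inside the affected container: RX/TX totals plus drop/error counters
ip -s link show eth0

# on the host node: the same counters for the bridge and the bond
ip -s link show vmbr0
ip -s link show bond0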
And another weird thing: once I migrate a container with this issue to the other node in the cluster, which has a similar network configuration, it runs smoothly without packet loss, just like above:
Code:
--- 172.16.x.x ping statistics ---
1000 packets transmitted, 1000 received, 0% packet loss, time 1072ms
rtt min/avg/max/mdev = 0.117/0.196/0.253/0.027 ms, ipg/ewma 1.073/0.174 ms
To add even more weirdness, the same thing doesn't seem to happen with Debian-based containers.
Host node networking is LACP-bonded 10G Ethernet on both nodes.
Node1 (with issues) config:
Code:
/etc/network/interfaces:

auto bond0
iface bond0 inet manual
    bond_mode 802.3ad
    bond_miimon 100
    bond_downdelay 200
    bond_updelay 200
    slaves ens1f0 ens1f1

auto vmbr0
iface vmbr0 inet static
    address 172.16.x.x
    netmask 255.255.255.0
    gateway 172.16.x.1
    bridge_ports bond0
    bridge_stp on
    bridge_fd 0
    post-up ip route add 172.16.0.0/19 via 172.16.2.254
And node2 (without issues):
Code:
/etc/network/interfaces:

auto bond0
iface bond0 inet manual
    bond-slaves enp1s0f0 enp1s0f1
    bond-miimon 100
    bond-mode 802.3ad
    bond_downdelay 200
    bond_updelay 200

auto vmbr0
iface vmbr0 inet static
    address 172.16.x.x/24
    gateway 172.16.x.1
    bridge-ports bond0
    bridge-stp on
    bridge-fd 0
    post-up ip route add 172.16.0.0/19 via 172.16.2.254
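In case the on-disk configs hide something, I could also dump and diff the runtime state of the bond and the bridge on both nodes (bond0/vmbr0 as named in the configs above), roughly like this:
Code:
# effective bond/bridge parameters as the kernel actually applied them
ip -d link show bond0
ip -d link show vmbr0

# negotiated LACP state and per-slave details
cat /proc/net/bonding/bond0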
Both nodes run Intel X520 NICs with the default ixgbe driver. So far I've failed to find any meaningful difference between the configs or anything else, and the network itself seems fine, since the host node pings normally and so do the Debian-based containers.
/proc/net/bonding/bond0 doesn't show any errors either.
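If more data would help, I can also collect the NIC drop/error counters and compare offload settings between a CentOS and a Debian container; something like this (slave names as in the configs above, eth0 assumed inside the containers):
Code:
# NIC/driver drop, missed and error counters (node1 slave names; enp1s0f* on node2)
ethtool -S ens1f0 | grep -Ei 'drop|miss|err'
ethtool -S ens1f1 | grep -Ei 'drop|miss|err'

# inside a CentOS and a Debian container: current offload settings on its interface
ethtool -k eth0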
Any ideas?