Something regarding MTU sizes I ran into with my Proxmox configuration. I have a SAN bridge (Linux bridge) on vmbr1 that includes eno1 (either a bnx2x or ixgbe interface, depending on the server) as a member interface. I had played around with MTU sizes and was getting sporadic results.
It turns out it's a timing issue with bringing the physical interfaces up. It's something Debian has had in its bug queue for a while (dating back several years now), and it manifests itself in a variety of ways. Reference thread:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1399064
In my case, the bridge interface was being brought up before the 10GbE interfaces were fully ready. There's some strange logic in Debian's bridge ifup scripts that clamps a bridge without members to a specific MTU (which is supposed to be 1500 bytes), but in my case I was getting a LOT of drops on the physical interface after bringing the bridge up. The difference in performance was staggering. Before adding the "pre-up sleep 5" command, I was getting ~18MB/s on my SAN interface. After getting it stabilized, I'm getting almost 400MB/s on the SAN interface. The only way I found to automate this was by changing the pre-up on the bridge interface. My interfaces file now looks like this:
Code:
# network interface settings; autogenerated
# Please do NOT modify this file directly, unless you know what
# you're doing.
#
# If you want to manage part of the network configuration manually,
# please utilize the 'source' or 'source-directory' directives to do
# so.
# PVE will preserve these directives, but will NOT read its network
# configuration from sourced files, so do not attempt to move any of
# the PVE managed interfaces into external files!
auto lo
iface lo inet loopback

iface enp3s0f0 inet manual

iface idrac inet manual

auto eno1
iface eno1 inet manual
        pre-up ifconfig $IFACE up && ethtool -G $IFACE rx 4096 tx 4096
        post-down ifconfig $IFACE down
        mtu 1500

auto eno2
iface eno2 inet manual
        pre-up ifconfig $IFACE up && ethtool -G $IFACE rx 4096 tx 4096
        post-down ifconfig $IFACE down
        mtu 1500

iface enp3s0f1 inet manual

auto vmbr0
iface vmbr0 inet static
        address 10.1.60.50
        netmask 255.255.0.0
        gateway 10.1.1.1
        bridge_ports enp3s0f0
        bridge_stp off
        bridge_fd 0

auto vmbr1
iface vmbr1 inet static
        address 192.168.210.50
        netmask 255.255.255.0
        bridge_ports eno1
        bridge_stp off
        bridge_fd 0
        pre-up sleep 5
Note the use of "pre-up sleep 5". That's the only way I could stabilize even a 1500-byte MTU. I haven't tested migrating to a 9000-byte MTU yet.
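One idea I've been toying with instead of a fixed sleep: poll the kernel's carrier flag for the member NIC until the link actually reports up. This is just a sketch, not something I've deployed; the function name, timeout, and poll interval are all my own assumptions, and it falls back to the old behavior (give up and let the bridge come up anyway) if the NIC never reports a carrier:

```shell
#!/bin/sh
# Sketch: wait for a NIC to report a carrier instead of sleeping a
# fixed 5 seconds. Name, timeout, and interval are assumptions.
wait_for_carrier() {
    iface="$1"
    tries="${2:-20}"                     # poll up to tries * 0.25s
    while [ "$tries" -gt 0 ]; do
        # /sys/class/net/<iface>/carrier reads "1" once the link is up;
        # the read errors out while the interface is still down, so
        # silence stderr and just retry.
        if [ "$(cat /sys/class/net/$iface/carrier 2>/dev/null)" = "1" ]; then
            return 0
        fi
        sleep 0.25
        tries=$((tries - 1))
    done
    return 1                             # timed out; caller decides what to do
}
```

If that were installed as, say, /usr/local/sbin/wait-for-carrier (hypothetical path), the vmbr1 stanza could use "pre-up /usr/local/sbin/wait-for-carrier eno1 || true" in place of the fixed sleep, so the bridge waits only as long as the NIC actually needs.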
Any alternative suggestions to get this fully automated would be greatly appreciated. I'd also like to get back to 9000-byte PDUs on that interface without a dramatic impact on all of the sockets. If I force the MTU to 9000 bytes at the command line, the iSCSI initiator disconnects and reconnects, but I lose about 250MB/s in performance.
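For what it's worth, before committing mtu 9000 to the interfaces file, the whole path (NIC, bridge, switch, SAN target) can be sanity-checked transiently with don't-fragment pings. A sketch of the console commands, assuming the addresses from the config above and that the SAN target answers ICMP (run as root; substitute your target's IP):

```shell
# Raise the MTU on the member NIC first, then on the bridge on top of it.
ip link set eno1 mtu 9000
ip link set vmbr1 mtu 9000

# 8972 = 9000 - 20 (IP header) - 8 (ICMP header). "-M do" sets the
# don't-fragment bit, so the ping fails loudly if anything along the
# path is still clamped to 1500 instead of silently fragmenting.
ping -M do -s 8972 -c 3 192.168.210.1
```

If the large pings get through but iSCSI throughput still drops, that would at least rule out a path-MTU mismatch and point at something else (ring buffers, offload settings, or the target side).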