igb driver on latest kernel 4.15.17-3-pve - net connections over jumbo frames anomalies

resoli

Renowned Member
Mar 9, 2010
I recently performed a routine update of my 3-node PVE 5.2 cluster, installing the new pve-kernel-4.15.17-3-pve .

I then started to have problems synchronizing DRBD resources over a link with jumbo frames enabled:


Jun 14 08:59:26 pve1 kernel: [40906.042440] drbd vm-102-disk-1/0 drbd103 pve3: Began resync as SyncSource (will sync 3476 KB [869 bits set]).
Jun 14 08:59:39 pve1 kernel: [40918.313936] drbd vm-102-disk-1 pve3: [drbd_s_vm-102-d/3075] sending time expired, ko = 6
Jun 14 08:59:45 pve1 kernel: [40924.458069] drbd vm-102-disk-1 pve3: [drbd_s_vm-102-d/3075] sending time expired, ko = 5
Jun 14 08:59:51 pve1 kernel: [40930.602191] drbd vm-102-disk-1 pve3: [drbd_s_vm-102-d/3075] sending time expired, ko = 4
Jun 14 08:59:57 pve1 kernel: [40936.746355] drbd vm-102-disk-1 pve3: [drbd_s_vm-102-d/3075] sending time expired, ko = 3


My configuration uses drbd9 in a dedicated network mesh configuration described here:

https://lists.gt.net/drbd/users/28251#28251

In brief, I put two interfaces into a "drbdbr" bridge on each host, blocking forwarding between them with ebtables rules.

Each interface has jumbo frames (mtu=9000) enabled.
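For reference, the per-node setup looks roughly like this (a sketch only: the NIC names enp1s0/enp2s0 and the 10.10.10.x addressing are placeholder assumptions, not my exact config):

```shell
# Sketch of the per-node mesh setup; enp1s0/enp2s0 are placeholder NIC names.
ip link add name drbdbr type bridge
ip link set dev enp1s0 mtu 9000 master drbdbr
ip link set dev enp2s0 mtu 9000 master drbdbr
ip link set dev enp1s0 up
ip link set dev enp2s0 up
ip link set dev drbdbr mtu 9000 up
ip addr add 10.10.10.1/24 dev drbdbr   # example mesh address for this node
# Block forwarding between the two bridge ports so the three-node mesh
# does not create a loop; traffic may only terminate on the local host:
ebtables -A FORWARD -i enp1s0 -o enp2s0 -j DROP
ebtables -A FORWARD -i enp2s0 -o enp1s0 -j DROP
```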

I suspect the problem is in the updated (out-of-tree) igb driver,

Intel(R) Gigabit Ethernet Linux Driver - version 5.3.5.18

because I saw in the forum that there were problems with jumbo frames in the past.

I reverted to the previous "pve-kernel-4.15.17-2-pve" kernel, with its in-tree igb driver:

igb: Intel(R) Gigabit Ethernet Network Driver - version 5.4.0-k
to restore correct functionality. Unfortunately, removing the latest kernel has the side effect of also removing the proxmox-ve and pve-kernel-4.15 packages.

Any hint?

Thanks,
rob
 
Each interface has jumbo frames (mtu=9000) enabled.
Maybe it helps to lower the MTU, some drivers do not account for the whole frame.

I reverted back to the previous "pve-kernel-4.15.17-2-pve" kernel with in-tree igb driver:
You can set in grub from which kernel you want to boot by default (grub-set-default).
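A rough sketch of how that works (the menu entry index is an example; it will differ per system):

```shell
# List the available GRUB menu entries (indices start at 0):
awk -F\' '/^menuentry /{print i++ " : " $2}' /boot/grub/grub.cfg

# Pin a specific entry; this requires GRUB_DEFAULT=saved in
# /etc/default/grub, followed by running update-grub once:
grub-set-default 2   # example index; pick the 4.15.17-2 entry on your system
```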
 
I already tried lowering the MTU to 1500: it solves the connection problems, but the performance penalty is unbearable.

I know that I can boot a previous kernel, thanks for the hint: I wasn't aware of the "grub-set-default" command; very handy.

In my opinion, in many situations a driver that does not work well with jumbo frames makes the kernel in question useless ...

cheers,
rob
 
I already tried to lower mtu to 1500: solves connection problems, but performance penalty is unbearable.
Try lowering it moderately, e.g. to 8800, so that the frame overhead still fits within what the driver can handle.
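A quick way to check whether a given MTU actually passes end to end (interface name and peer address are examples):

```shell
ip link set dev drbdbr mtu 8800
# ICMP payload = MTU minus 20 bytes IP header and 8 bytes ICMP header;
# -M do sets the don't-fragment bit so oversized frames fail loudly
# instead of being silently fragmented.
ping -M do -s 8772 -c 3 10.10.10.2
```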
 
Sorry, but I do not want to dedicate further time (which would involve rebooting all nodes) to what seems to me clearly a driver issue. I will follow the 4.15 thread. Please consider this one closed.

Thanks to all,
rob
 
