Hi all,
About 6 hours after upgrading to 4.15.18-19-pve, we noticed a failing CX4 network adapter. The only trace of it is in dmesg:
Aug 13 08:33:37 vm4 kernel: [ 3.297231] ixgbe 0000:81:00.0: Multiqueue Enabled: Rx Queue count = 16, Tx Queue count = 16 XDP Queue count = 0
Aug 13 08:33:37 vm4 kernel: [ 3.297410] ixgbe 0000:81:00.0: PCI Express bandwidth of 16GT/s available
Aug 13 08:33:37 vm4 kernel: [ 3.297412] ixgbe 0000:81:00.0: (Speed:2.5GT/s, Width: x8, Encoding Loss:20%)
Aug 13 08:33:37 vm4 kernel: [ 3.297483] ixgbe 0000:81:00.0: MAC: 1, PHY: 0, PBA No: E37623-004
Aug 13 08:33:37 vm4 kernel: [ 3.297484] ixgbe 0000:81:00.0: 00:1b:21:8d:d8:d3
Aug 13 08:33:37 vm4 kernel: [ 3.309510] ixgbe 0000:81:00.0: Intel(R) 10 Gigabit Network Connection
Aug 13 08:33:37 vm4 kernel: [ 3.409215] ixgbe 0000:81:00.1: Multiqueue Enabled: Rx Queue count = 16, Tx Queue count = 16 XDP Queue count = 0
Aug 13 08:33:37 vm4 kernel: [ 3.409394] ixgbe 0000:81:00.1: PCI Express bandwidth of 16GT/s available
Aug 13 08:33:37 vm4 kernel: [ 3.409396] ixgbe 0000:81:00.1: (Speed:2.5GT/s, Width: x8, Encoding Loss:20%)
Aug 13 08:33:37 vm4 kernel: [ 3.409467] ixgbe 0000:81:00.1: MAC: 1, PHY: 0, PBA No: E37623-004
Aug 13 08:33:37 vm4 kernel: [ 3.409468] ixgbe 0000:81:00.1: 00:1b:21:8d:d8:d2
Aug 13 08:33:37 vm4 kernel: [ 3.421464] ixgbe 0000:81:00.1: Intel(R) 10 Gigabit Network Connection
Aug 13 08:33:37 vm4 kernel: [ 3.544187] ixgbe 0000:81:00.1 ens6f1: renamed from eth4
Aug 13 08:33:37 vm4 kernel: [ 3.572225] ixgbe 0000:81:00.0 ens6f0: renamed from eth3
Aug 13 08:33:37 vm4 kernel: [ 8.470703] ixgbe 0000:81:00.1 ens6f1: changing MTU from 1500 to 9000
Aug 13 08:33:38 vm4 kernel: [ 8.898277] ixgbe 0000:81:00.1 ens6f1: NIC Link is Up 10 Gbps, Flow Control: RX/TX
Aug 13 14:31:02 vm4 kernel: [21452.957177] ixgbe 0000:81:00.1: Adapter removed
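To check other nodes for the same event, something like the following could be used. This is only a sketch: the function name `removed_adapters` is made up for illustration, and it assumes the kernel log lines have the same layout as the ones above.

```shell
#!/bin/sh
# Hypothetical helper: print the PCI address of every ixgbe port that the
# kernel reported as removed. Reads kernel log lines on stdin.
removed_adapters() {
    # Match "ixgbe <pci-address>: Adapter removed" and keep only the address.
    sed -n 's/.*ixgbe \([0-9a-fA-F:.]*\): Adapter removed.*/\1/p'
}

# Example usage (either source of kernel messages should work):
#   dmesg | removed_adapters
#   journalctl -k | removed_adapters
```

Fed the log above, this should print only the address of the affected port and ignore the ordinary link-up and rename messages.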
All other ixgbe network adapters are still working. The adapter only came back after powering the server off completely (a reboot did not suffice) and booting kernel 4.15.18-18-pve again; it has now been working for more than a day.
Unloading and reloading the driver with modprobe did not help either (and it dropped all other ixgbe-based network connections).
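For reference, a driver reload of this kind usually looks roughly like the sketch below. The exact commands used at the time are not recorded here, so treat this as an assumption. The PCI remove/rescan step is an alternative that has not been tried on this node, and the helper names `run`, `reload_ixgbe` and `rescan_port` are invented for illustration. By default the script only prints what it would do.

```shell
#!/bin/sh
# Sketch only: prints the commands instead of running them unless RUN=1 is set.
# The PCI address comes from the dmesg output; everything else is an assumption.
DEV="${DEV:-0000:81:00.1}"

run() {
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    else
        echo "would run: $*"
    fi
}

reload_ixgbe() {
    # Unload and reload the driver (this drops every ixgbe link on the host).
    run modprobe -r ixgbe
    run modprobe ixgbe
}

rescan_port() {
    # Untested alternative: detach the single port from the PCI bus and
    # ask the kernel to rescan, leaving the other ixgbe ports alone.
    run sh -c "echo 1 > /sys/bus/pci/devices/$DEV/remove"
    run sh -c "echo 1 > /sys/bus/pci/rescan"
}

reload_ixgbe
```

Setting `RUN=1` (as root) would execute the commands for real, so on a production node the dry-run output is the safer thing to look at first.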
The mainboard is a Supermicro X10DRI-T.
The card is a dual-port CX4 adapter with only one port connected to an HP6410 switch:
Intel Corporation 82598EB 10-Gigabit AT CX4 Network Connection (rev 01)
driver: ixgbe
version: 5.1.0-k
firmware-version: 0xb5050000
expansion-rom-version:
bus-info: 0000:81:00.1
supports-statistics: yes
supports-test: yes
supports-eeprom-access: yes
supports-register-dump: yes
supports-priv-flags: yes
proxmox-ve: 5.4-2 (running kernel: 4.15.18-18-pve)
pve-manager: 5.4-13 (running version: 5.4-13/aee6f0ec)
pve-kernel-4.15: 5.4-8
pve-kernel-4.15.18-20-pve: 4.15.18-46
pve-kernel-4.15.18-19-pve: 4.15.18-45
pve-kernel-4.15.18-18-pve: 4.15.18-44
pve-kernel-4.15.18-14-pve: 4.15.18-39
pve-kernel-4.15.18-13-pve: 4.15.18-37
pve-kernel-4.15.18-11-pve: 4.15.18-34
pve-kernel-4.15.18-10-pve: 4.15.18-32
pve-kernel-4.15.18-9-pve: 4.15.18-30
ceph: 12.2.12-pve1
corosync: 2.4.4-pve1
criu: 2.11.1-1~bpo90
glusterfs-client: 3.8.8-1
ksm-control-daemon: 1.2-2
libjs-extjs: 6.0.1-2
libpve-access-control: 5.1-12
libpve-apiclient-perl: 2.0-5
libpve-common-perl: 5.0-54
libpve-guest-common-perl: 2.0-20
libpve-http-server-perl: 2.0-14
libpve-storage-perl: 5.0-44
libqb0: 1.0.3-1~bpo9
lvm2: 2.02.168-pve6
lxc-pve: 3.1.0-5
lxcfs: 3.0.3-pve1
novnc-pve: 1.0.0-3
openvswitch-switch: 2.7.0-3
proxmox-widget-toolkit: 1.0-28
pve-cluster: 5.0-38
pve-container: 2.0-40
pve-docs: 5.4-2
pve-edk2-firmware: 1.20190312-1
pve-firewall: 3.0-22
pve-firmware: 2.0-7
pve-ha-manager: 2.0-9
pve-i18n: 1.1-4
pve-libspice-server1: 0.14.1-2
pve-qemu-kvm: 3.0.1-4
pve-xtermjs: 3.12.0-1
qemu-server: 5.0-54
smartmontools: 6.5+svn4324-1
spiceterm: 3.0-5
vncterm: 1.5-3
zfsutils-linux: 0.7.13-pve1~bpo2
AFAIK there are no changes related to ixgbe in 4.15.18-20, so I have not tested it yet, as the node is in production.
So, any idea what I could try to find the reason for the drop, or should I just stick with 4.15.18-18 for now?
Thanks in advance
Falko