Flapping Network NICs on Ceph Public Network VLAN

psionic

Member
May 23, 2019
Same port on all 4 nodes; the full report is far too long to paste here. This port is used for the Ceph Public Network VLAN...
lsmod | grep -i i40e
i40e 385024 0
root@pve14:~# cat /var/log/messages | grep -i i40e

Jan 2 06:25:54 pve14 kernel: [560724.602777] i40e 0000:81:00.2 ens1f2: NIC Link is Down
Jan 2 06:25:59 pve14 kernel: [560729.883032] i40e 0000:81:00.2 ens1f2: NIC Link is Up, 10 Gbps Full Duplex, Flow Control: None
Jan 2 06:28:48 pve14 kernel: [560899.428753] i40e 0000:81:00.2 ens1f2: NIC Link is Down
Jan 2 06:28:54 pve14 kernel: [560904.845987] i40e 0000:81:00.2 ens1f2: NIC Link is Up, 10 Gbps Full Duplex, Flow Control: None
Jan 2 09:21:27 pve14 kernel: [571258.196615] i40e 0000:81:00.2 ens1f2: NIC Link is Down
Jan 2 09:21:33 pve14 kernel: [571263.697309] i40e 0000:81:00.2 ens1f2: NIC Link is Up, 10 Gbps Full Duplex, Flow Control: None
Jan 2 09:29:25 pve14 kernel: [571735.622712] i40e 0000:81:00.2 ens1f2: NIC Link is Down
Jan 2 09:29:30 pve14 kernel: [571740.822120] i40e 0000:81:00.2 ens1f2: NIC Link is Up, 10 Gbps Full Duplex, Flow Control: None
Jan 2 10:37:45 pve14 kernel: [575836.036942] i40e 0000:81:00.2 ens1f2: NIC Link is Down
Jan 2 10:37:50 pve14 kernel: [575840.670088] i40e 0000:81:00.2 ens1f2: NIC Link is Up, 10 Gbps Full Duplex, Flow Control: None
Jan 2 11:10:36 pve14 kernel: [577807.551073] i40e 0000:81:00.2 ens1f2: NIC Link is Down
Jan 2 11:10:41 pve14 kernel: [577812.509775] i40e 0000:81:00.2 ens1f2: NIC Link is Up, 10 Gbps Full Duplex, Flow Control: None
Jan 2 12:54:53 pve14 kernel: [584064.355090] i40e 0000:81:00.2 ens1f2: NIC Link is Down
Jan 2 12:54:58 pve14 kernel: [584069.024122] i40e 0000:81:00.2 ens1f2: NIC Link is Up, 10 Gbps Full Duplex, Flow Control: None
Jan 2 12:57:12 pve14 kernel: [584203.393785] i40e 0000:81:00.2 ens1f2: NIC Link is Down
Jan 2 12:57:23 pve14 kernel: [584214.394146] i40e 0000:81:00.2 ens1f2: NIC Link is Up, 10 Gbps Full Duplex, Flow Control: None
Jan 2 12:57:54 pve14 kernel: [584245.063220] i40e 0000:81:00.2 ens1f2: NIC Link is Down
Jan 2 12:57:58 pve14 kernel: [584249.594366] i40e 0000:81:00.2 ens1f2: NIC Link is Up, 10 Gbps Full Duplex, Flow Control: None
Jan 2 13:25:59 pve14 kernel: [585929.820363] i40e 0000:81:00.2 ens1f2: NIC Link is Down
Jan 2 13:26:03 pve14 kernel: [585934.431970] i40e 0000:81:00.2 ens1f2: NIC Link is Up, 10 Gbps Full Duplex, Flow Control: None
Jan 2 13:38:52 pve14 kernel: [586702.826831] i40e 0000:81:00.2 ens1f2: NIC Link is Down
Jan 2 13:38:56 pve14 kernel: [586707.384546] i40e 0000:81:00.2 ens1f2: NIC Link is Up, 10 Gbps Full Duplex, Flow Control: None
Jan 2 13:46:40 pve14 kernel: [587170.755550] i40e 0000:81:00.2 ens1f2: NIC Link is Down
Jan 2 13:46:44 pve14 kernel: [587175.267977] i40e 0000:81:00.2 ens1f2: NIC Link is Up, 10 Gbps Full Duplex, Flow Control: None
Jan 2 14:05:10 pve14 kernel: [588280.996974] i40e 0000:81:00.2 ens1f2: NIC Link is Down
Jan 2 14:05:14 pve14 kernel: [588285.582592] i40e 0000:81:00.2 ens1f2: NIC Link is Up, 10 Gbps Full Duplex, Flow Control: None
Jan 2 14:14:20 pve14 kernel: [588831.248639] i40e 0000:81:00.2 ens1f2: NIC Link is Down
Jan 2 14:14:25 pve14 kernel: [588835.911664] i40e 0000:81:00.2 ens1f2: NIC Link is Up, 10 Gbps Full Duplex, Flow Control: None
Jan 2 14:23:59 pve14 kernel: [589409.922366] i40e 0000:81:00.2 ens1f2: NIC Link is Down
Jan 2 14:24:03 pve14 kernel: [589414.691526] i40e 0000:81:00.2 ens1f2: NIC Link is Up, 10 Gbps Full Duplex, Flow Control: None
Jan 2 14:41:28 pve14 kernel: [590459.276575] i40e 0000:81:00.2 ens1f2: NIC Link is Down
Jan 2 14:41:33 pve14 kernel: [590464.573921] i40e 0000:81:00.2 ens1f2: NIC Link is Up, 10 Gbps Full Duplex, Flow Control: None
Jan 2 14:55:20 pve14 kernel: [591291.399835] i40e 0000:81:00.3 ens1f3: NIC Link is Down
Jan 2 14:55:25 pve14 kernel: [591296.608964] i40e 0000:81:00.3 ens1f3: NIC Link is Up, 10 Gbps Full Duplex, Flow Control: None
Jan 2 15:04:57 pve14 kernel: [591867.726384] i40e 0000:81:00.3 ens1f3: changing MTU from 1500 to 9000
Jan 2 15:04:57 pve14 kernel: [591867.991374] i40e 0000:81:00.2 ens1f2: changing MTU from 1500 to 9000
Jan 2 15:05:37 pve14 kernel: [591908.002967] i40e 0000:81:00.2 ens1f2: NIC Link is Down
Jan 2 15:05:42 pve14 kernel: [591912.769570] i40e 0000:81:00.2 ens1f2: NIC Link is Up, 10 Gbps Full Duplex, Flow Control: None
Jan 2 15:28:23 pve14 kernel: [593274.712102] i40e 0000:81:00.2 ens1f2: NIC Link is Down
Jan 2 15:28:28 pve14 kernel: [593279.142737] i40e 0000:81:00.2 ens1f2: NIC Link is Up, 10 Gbps Full Duplex, Flow Control: None
Jan 2 15:32:27 pve14 kernel: [593518.580157] i40e 0000:81:00.1 ens1f1: changing MTU from 1500 to 9000
Jan 2 15:32:28 pve14 kernel: [593518.848346] i40e 0000:81:00.0 ens1f0: changing MTU from 1500 to 9000
Jan 2 16:00:04 pve14 kernel: [595174.911908] i40e 0000:81:00.2 ens1f2: NIC Link is Down
Jan 2 16:00:14 pve14 kernel: [595185.149774] i40e 0000:81:00.2 ens1f2: NIC Link is Up, 10 Gbps Full Duplex, Flow Control: None
Jan 2 16:01:06 pve14 kernel: [595237.689978] i40e 0000:81:00.2 ens1f2: NIC Link is Down
Jan 2 16:01:11 pve14 kernel: [595242.363197] i40e 0000:81:00.2 ens1f2: NIC Link is Up, 10 Gbps Full Duplex, Flow Control: None
Jan 2 16:02:39 pve14 kernel: [595330.168774] i40e 0000:81:00.2 ens1f2: NIC Link is Down
Jan 2 16:02:49 pve14 kernel: [595340.072508] i40e 0000:81:00.2 ens1f2: NIC Link is Up, 10 Gbps Full Duplex, Flow Control: None
Jan 2 16:12:52 pve14 kernel: [595943.057658] i40e 0000:81:00.3 ens1f3: NIC Link is Down
Jan 2 16:12:57 pve14 kernel: [595948.445440] i40e 0000:81:00.3 ens1f3: NIC Link is Up, 10 Gbps Full Duplex, Flow Control: None
Jan 2 16:55:00 pve14 kernel: [598470.780156] i40e 0000:81:00.3 ens1f3: NIC Link is Down
Jan 2 16:55:05 pve14 kernel: [598475.908633] i40e 0000:81:00.3 ens1f3: NIC Link is Up, 10 Gbps Full Duplex, Flow Control: None
Jan 2 17:40:05 pve14 kernel: [601175.958033] i40e 0000:81:00.2 ens1f2: NIC Link is Down
Jan 2 17:40:09 pve14 kernel: [601180.307483] i40e 0000:81:00.2 ens1f2: NIC Link is Up, 10 Gbps Full Duplex, Flow Control: None
Jan 2 17:40:20 pve14 kernel: [601191.463912] i40e 0000:81:00.2 ens1f2: NIC Link is Down
Jan 2 17:40:25 pve14 kernel: [601196.090469] i40e 0000:81:00.2 ens1f2: NIC Link is Up, 10 Gbps Full Duplex, Flow Control: None
Jan 2 17:40:36 pve14 kernel: [601207.789022] i40e 0000:81:00.2 ens1f2: NIC Link is Down
Jan 2 17:40:41 pve14 kernel: [601212.288786] i40e 0000:81:00.2 ens1f2: NIC Link is Up, 10 Gbps Full Duplex, Flow Control: None
Jan 2 17:41:12 pve14 kernel: [601242.820083] i40e 0000:81:00.2 ens1f2: NIC Link is Down
Jan 2 17:41:16 pve14 kernel: [601247.177514] i40e 0000:81:00.2 ens1f2: NIC Link is Up, 10 Gbps Full Duplex, Flow Control: None
Jan 2 17:41:58 pve14 kernel: [601288.959934] i40e 0000:81:00.2 ens1f2: NIC Link is Down
Jan 2 17:42:02 pve14 kernel: [601293.384336] i40e 0000:81:00.2 ens1f2: NIC Link is Up, 10 Gbps Full Duplex, Flow Control: None
Jan 2 17:42:48 pve14 kernel: [601338.911959] i40e 0000:81:00.2 ens1f2: NIC Link is Down
Jan 2 17:42:52 pve14 kernel: [601343.536913] i40e 0000:81:00.2 ens1f2: NIC Link is Up, 10 Gbps Full Duplex, Flow Control: None
Jan 2 17:56:41 pve14 kernel: [602172.565277] i40e 0000:81:00.3 ens1f3: NIC Link is Down
Jan 2 17:56:47 pve14 kernel: [602178.064176] i40e 0000:81:00.3 ens1f3: NIC Link is Up, 10 Gbps Full Duplex, Flow Control: None
Jan 2 19:21:50 pve14 kernel: [607281.528856] i40e 0000:81:00.2 ens1f2: NIC Link is Down
Jan 2 19:21:55 pve14 kernel: [607285.947416] i40e 0000:81:00.2 ens1f2: NIC Link is Up, 10 Gbps Full Duplex, Flow Control: None
Jan 2 20:36:26 pve14 kernel: [611757.511305] i40e 0000:81:00.3 ens1f3: NIC Link is Down
Jan 2 20:36:31 pve14 kernel: [611762.822557] i40e 0000:81:00.3 ens1f3: NIC Link is Up, 10 Gbps Full Duplex, Flow Control: None
Jan 2 21:11:05 pve14 kernel: [613836.789537] i40e 0000:81:00.2 ens1f2: NIC Link is Down
Jan 2 21:11:11 pve14 kernel: [613842.027478] i40e 0000:81:00.2 ens1f2: NIC Link is Up, 10 Gbps Full Duplex, Flow Control: None
Jan 2 21:12:20 pve14 kernel: [613911.468602] i40e 0000:81:00.2 ens1f2: NIC Link is Down
Jan 2 21:12:24 pve14 kernel: [613915.854505] i40e 0000:81:00.2 ens1f2: NIC Link is Up, 10 Gbps Full Duplex, Flow Control: None
Jan 2 21:12:37 pve14 kernel: [613928.234664] i40e 0000:81:00.2 ens1f2: NIC Link is Down
Jan 2 21:12:41 pve14 kernel: [613932.779733] i40e 0000:81:00.2 ens1f2: NIC Link is Up, 10 Gbps Full Duplex, Flow Control: None
Jan 2 21:12:46 pve14 kernel: [613937.868638] i40e 0000:81:00.2 ens1f2: NIC Link is Down
Jan 2 21:12:51 pve14 kernel: [613942.436384] i40e 0000:81:00.2 ens1f2: NIC Link is Up, 10 Gbps Full Duplex, Flow Control: None
Jan 2 21:14:51 pve14 kernel: [614062.756078] i40e 0000:81:00.2 ens1f2: NIC Link is Down
Jan 2 21:14:56 pve14 kernel: [614067.662046] i40e 0000:81:00.2 ens1f2: NIC Link is Up, 10 Gbps Full Duplex, Flow Control: None
Jan 2 21:27:32 pve14 kernel: [614823.508075] i40e 0000:81:00.3 ens1f3: NIC Link is Down
Jan 2 21:27:37 pve14 kernel: [614828.881194] i40e 0000:81:00.3 ens1f3: NIC Link is Up, 10 Gbps Full Duplex, Flow Control: None
Jan 2 21:38:53 pve14 kernel: [615504.341827] i40e 0000:81:00.3 ens1f3: NIC Link is Down
Jan 2 21:38:58 pve14 kernel: [615509.004784] i40e 0000:81:00.3 ens1f3: NIC Link is Up, 10 Gbps Full Duplex, Flow Control: None
Jan 2 22:26:57 pve14 kernel: [618388.589768] i40e 0000:81:00.2 ens1f2: NIC Link is Down
Jan 2 22:27:02 pve14 kernel: [618393.767598] i40e 0000:81:00.2 ens1f2: NIC Link is Up, 10 Gbps Full Duplex, Flow Control: None

Any ideas?
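For anyone triaging similar logs, a quick way to summarize flap counts per interface is an awk one-liner over the kernel log (a sketch; a few sample lines in the format shown above are inlined here in place of /var/log/messages):

```shell
# Count "NIC Link is Down" events per interface. Field 9 of each log
# line is the interface name with a trailing colon (e.g. "ens1f2:").
awk '/NIC Link is Down/ { sub(/:$/, "", $9); down[$9]++ }
     END { for (i in down) print i, down[i], "down events" }' <<'EOF'
Jan 2 06:25:54 pve14 kernel: [560724.602777] i40e 0000:81:00.2 ens1f2: NIC Link is Down
Jan 2 14:55:20 pve14 kernel: [591291.399835] i40e 0000:81:00.3 ens1f3: NIC Link is Down
Jan 2 15:28:23 pve14 kernel: [593274.712102] i40e 0000:81:00.2 ens1f2: NIC Link is Down
EOF
```

On the real host, replace the here-document with `grep i40e /var/log/messages`.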
 
Hi,

as far as I know, the i40e has limited VLAN capacity, which can end in odd behavior.
Please send the network config and the "pveversion -v" report.
 
I'm using untagged VLANs on fully managed Netgear switches. All of my Ethernet ports are 10G: two Corosync rings, Ceph Public, Ceph Cluster, and two for the LAN network. Wouldn't all ports have an issue if it were an i40e problem?

I looked a little closer at 'cat /var/log/messages | grep -i i40e' and there are entries for both Ceph eth ports and none for either Corosync eth port.
All 4 ports are on the same NIC card. Just wondering if it is a Ceph-specific issue?

cat /etc/network/interfaces

auto lo
iface lo inet loopback

auto eno1
iface eno1 inet manual
#Prox-LAN

auto ens1f0
iface ens1f0 inet static
address 10.10.1.11
netmask 24
mtu 9000
#CoroSync-R0

auto ens1f1
iface ens1f1 inet static
address 10.10.2.11
netmask 24
mtu 9000
#CoroSync-R1

auto ens1f2
iface ens1f2 inet static
address 10.10.3.11
netmask 24
mtu 9000
#Ceph-Public1

auto ens1f3
iface ens1f3 inet static
address 10.10.4.11
netmask 24
mtu 9000
#Ceph-Cluster

iface eno2 inet manual
#Spare

auto vmbr0
iface vmbr0 inet static
address 192.168.1.110
netmask 16
gateway 192.168.2.3
bridge-ports eno1
bridge-stp off
bridge-fd 0
#LAN-Bridge
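Since the logs above show the MTU being raised to 9000 mid-run, it is worth verifying that jumbo frames actually pass end to end on the Ceph networks. A minimal sketch (the peer address 10.10.3.12 is hypothetical; substitute a real Ceph Public neighbor):

```shell
# With the don't-fragment flag set, the ICMP payload must be the MTU
# minus 28 bytes (20-byte IP header + 8-byte ICMP header), so a 9000
# MTU allows at most an 8972-byte payload.
MTU=9000
PAYLOAD=$((MTU - 28))
# Print the command to run against a real peer; if this ping fails while
# a 1472-byte payload succeeds, jumbo frames are not passing cleanly.
echo "ping -M do -s ${PAYLOAD} -c 3 10.10.3.12"
```

If any device in the path (switch port, NIC, peer) is still at MTU 1500, the oversized ping fails with "message too long".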

pveversion -v
proxmox-ve: 6.1-2 (running kernel: 5.3.13-1-pve)
pve-manager: 6.1-5 (running version: 6.1-5/9bf06119)
pve-kernel-5.3: 6.1-1
pve-kernel-helper: 6.1-1
pve-kernel-5.0: 6.0-11
pve-kernel-5.3.13-1-pve: 5.3.13-1
pve-kernel-5.3.10-1-pve: 5.3.10-1
pve-kernel-5.0.21-5-pve: 5.0.21-10
pve-kernel-5.0.15-1-pve: 5.0.15-1
ceph: 14.2.5-pve1
ceph-fuse: 14.2.5-pve1
corosync: 3.0.2-pve4
criu: 3.11-3
glusterfs-client: 5.5-3
ifupdown: residual config
ifupdown2: 1.2.8-1+pve4
ksm-control-daemon: 1.3-1
libjs-extjs: 6.0.1-10
libknet1: 1.13-pve1
libpve-access-control: 6.0-5
libpve-apiclient-perl: 3.0-2
libpve-common-perl: 6.0-9
libpve-guest-common-perl: 3.0-3
libpve-http-server-perl: 3.0-3
libpve-storage-perl: 6.1-3
libqb0: 1.0.5-1
libspice-server1: 0.14.2-4~pve6+1
lvm2: 2.03.02-pve3
lxc-pve: 3.2.1-1
lxcfs: 3.0.3-pve60
novnc-pve: 1.1.0-1
proxmox-mini-journalreader: 1.1-1
proxmox-widget-toolkit: 2.1-1
pve-cluster: 6.1-2
pve-container: 3.0-15
pve-docs: 6.1-3
pve-edk2-firmware: 2.20191127-1
pve-firewall: 4.0-9
pve-firmware: 3.0-4
pve-ha-manager: 3.0-8
pve-i18n: 2.0-3
pve-qemu-kvm: 4.1.1-2
pve-xtermjs: 3.13.2-1
qemu-server: 6.1-4
smartmontools: 7.0-pve2
spiceterm: 3.1-1
vncterm: 1.6-1
zfsutils-linux: 0.8.2-pve2
 
Can you send these log entries?
I suspect a driver bug.
Your config is not complicated, so it should work without problems.

Can you send the output of this command?

Code:
ethtool -i ens1f2
ethtool -k ens1f2
ethtool -S ens1f2
You must install ethtool if it is not installed yet.
 
see attached file...
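As an aside for anyone reading the attachment: `ethtool -S` on i40e prints a very long counter list, so filtering for error/fault counters makes it easier to spot problems. A sketch over a saved dump (the sample counter names and values below are illustrative, not real output from this host):

```shell
# Pull the flap-relevant counters out of a saved `ethtool -S ens1f2`
# dump; exact counter names vary by driver version.
grep -Ei 'error|fault|dropped' <<'EOF'
     rx_errors: 0
     tx_errors: 0
     port.tx_dropped_link_down: 12
     port.mac_local_faults: 3
EOF
```

Rising `fault` counters between runs point at the physical layer (cable, SFP, switch port) rather than the driver.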
 