tl;dr: I'm experiencing a very flaky network connection on a Dell PowerEdge R630 with the Intel X710/i350 network daughter card. The node itself drops offline and comes back online, apparently at random, and VMs and containers on the node are also very flaky.
Background: I've had a Proxmox cluster running on three nodes of a PowerEdge C6220 II for a while. It's run reasonably well, but for a variety of reasons, I want to move those nodes to R630s. So I ordered one from eBay configured as desired, moved the boot drives (ZFS mirror boot pool) to the new hardware, edited /etc/network/interfaces to reflect the new interface names, and figured I'd be good to go.
Well, not so much. The first problem I encountered was that the primary interface was down and stayed down. Replacing the SFP+ optic with another one (both Intel-compatible units from fs.com) brought the link up, mostly. But it still drops, and VMs/containers on that system are very flaky.
Not sure where I should be looking. I don't see anything untoward in /var/log/syslog, but I'm not confident I know what to look for. Output of pveversion -v, lspci -v -s 01:00.0, and content of /etc/network/interfaces below:
Code:
root@pve3 ➜ ~ pveversion -v
proxmox-ve: 7.4-1 (running kernel: 5.15.107-1-pve)
pve-manager: 7.4-3 (running version: 7.4-3/9002ab8a)
pve-kernel-5.15: 7.4-2
pve-kernel-5.13: 7.1-9
pve-kernel-5.15.107-1-pve: 5.15.107-1
pve-kernel-5.15.104-1-pve: 5.15.104-2
pve-kernel-5.13.19-6-pve: 5.13.19-15
ceph: 17.2.5-pve1
ceph-fuse: 17.2.5-pve1
corosync: 3.1.7-pve1
criu: 3.15-1+pve-1
glusterfs-client: 9.2-1
ifupdown: residual config
ifupdown2: 3.1.0-1+pmx3
ksm-control-daemon: 1.4-1
libjs-extjs: 7.0.0-1
libknet1: 1.24-pve2
libproxmox-acme-perl: 1.4.4
libproxmox-backup-qemu0: 1.3.1-1
libproxmox-rs-perl: 0.2.1
libpve-access-control: 7.4-2
libpve-apiclient-perl: 3.2-1
libpve-common-perl: 7.3-4
libpve-guest-common-perl: 4.2-4
libpve-http-server-perl: 4.2-3
libpve-rs-perl: 0.7.5
libpve-storage-perl: 7.4-2
libqb0: 1.0.5-1
libspice-server1: 0.14.3-2.1
lvm2: 2.03.11-2.1
lxc-pve: 5.0.2-2
lxcfs: 5.0.3-pve1
novnc-pve: 1.4.0-1
proxmox-backup-client: 2.4.1-1
proxmox-backup-file-restore: 2.4.1-1
proxmox-kernel-helper: 7.4-1
proxmox-mail-forward: 0.1.1-1
proxmox-mini-journalreader: 1.3-1
proxmox-offline-mirror-helper: 0.5.1-1
proxmox-widget-toolkit: 3.6.5
pve-cluster: 7.3-3
pve-container: 4.4-3
pve-docs: 7.4-2
pve-edk2-firmware: 3.20230228-2
pve-firewall: 4.3-1
pve-firmware: 3.6-5
pve-ha-manager: 3.6.1
pve-i18n: 2.12-1
pve-qemu-kvm: 7.2.0-8
pve-xtermjs: 4.16.0-1
qemu-server: 7.4-3
smartmontools: 7.2-pve3
spiceterm: 3.2-2
swtpm: 0.8.0~bpo11+3
vncterm: 1.7-1
zfsutils-linux: 2.1.11-pve1
root@pve3 ➜ ~ lspci -v -s 01:00.0
01:00.0 Ethernet controller: Intel Corporation Ethernet Controller X710 for 10GbE SFP+ (rev 02)
DeviceName: NIC1
Subsystem: Dell Ethernet 10G 4P X710/I350 rNDC
Flags: bus master, fast devsel, latency 0, IRQ 48, NUMA node 0
Memory at 91000000 (64-bit, prefetchable) [size=16M]
Memory at 92008000 (64-bit, prefetchable) [size=32K]
Expansion ROM at 92100000 [disabled] [size=512K]
Capabilities: [40] Power Management version 3
Capabilities: [50] MSI: Enable- Count=1/1 Maskable+ 64bit+
Capabilities: [70] MSI-X: Enable+ Count=129 Masked-
Capabilities: [a0] Express Endpoint, MSI 00
Capabilities: [e0] Vital Product Data
Capabilities: [100] Advanced Error Reporting
Capabilities: [140] Device Serial Number a8-56-b8-ff-ff-4b-43-e4
Capabilities: [1a0] Transaction Processing Hints
Capabilities: [1b0] Access Control Services
Capabilities: [1d0] Secondary PCI Express
Kernel driver in use: i40e
Kernel modules: i40e
root@pve3 ➜ ~ cat /etc/network/interfaces
# network interface settings; autogenerated
# Please do NOT modify this file directly, unless you know what
# you're doing.
#
# If you want to manage parts of the network configuration manually,
# please utilize the 'source' or 'source-directory' directives to do
# so.
# PVE will preserve these directives, but will NOT read its network
# configuration from sourced files, so do not attempt to move any of
# the PVE managed interfaces into external files!
auto lo
iface lo inet loopback
iface eno1 inet manual # aka enp1s0f0
auto eno2
iface eno2 inet static
address 192.168.5.103/24
auto #
iface # inet manual
iface eno1 inet manual
iface eno3 inet manual
iface eno4 inet manual
iface eno3 inet manual # aka enp8s0f0
iface eno4 inet manual # aka enp8s0f1
auto vmbr0
iface vmbr0 inet static
address 192.168.1.5/24
gateway 192.168.1.1
bridge-ports eno1
bridge-stp off
bridge-fd 0
bridge-vlan-aware yes
bridge-vids 2-4094
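On the question of what to look for in /var/log/syslog: below is a minimal sketch of a filter for link flaps. It assumes the i40e driver reports link changes with kernel messages containing "NIC Link is Up" / "NIC Link is Down", and that the flaky port is eno1 (the bridge port in the config above). The function name link_flaps is hypothetical, not part of any tool.

```shell
# Hypothetical helper: read log lines on stdin and print only the
# link-state messages for one interface.
# Assumption: the driver logs lines such as
#   "... eno1: NIC Link is Down" and "... eno1: NIC Link is Up, ..."
link_flaps() {
    iface="$1"
    grep -E "${iface}.*NIC Link is (Up|Down)"
}

# Usage (not executed here):
#   link_flaps eno1 < /var/log/syslog
```

If this prints alternating Down/Up lines with timestamps that match the outages, the drops are happening at the physical link (optic, fiber, or switch port) rather than in the bridge or VLAN configuration. If it prints nothing during an outage, the cause is probably elsewhere.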