Random USB NIC Disconnects (TP-LINK UE306 AX88179)

dblanque

Hey!
So I'm really out of luck with this and have to ask for help because I might go crazy.

We recently had to add a couple of USB NICs to our cluster so that we can migrate our OPNsense firewall VM, since we don't have managed switches or the budget for more dedicated network hardware. Since then we've been getting random disconnects on them, more frequently on the node that does not host our live firewall VM.

We have a triple public IPv4 WAN setup (one is PPPoE, one is a static IP assigned within a DMZ (not a USB NIC), one is a MAC-linked static DHCP lease), each connected to a small switch so that the FW can be migrated (this works very well and we've had no issues with it!).

Anyway, the random USB NIC disconnects seem to happen mostly:
  • On "unused" NICs (i.e. the host that isn't currently hosting the OPNsense VM randomly drops one USB NIC or the other, and after a while, maybe, both USB NICs go down).
  • When the WAN DHCP lease expires (not confirmed, just an assumption).
Neither of the WANs with actual static IPs (PPPoE and the private-subnet DMZ) seems to experience these issues.

Other things to keep in mind:
  • IOMMU is confirmed enabled on both nodes (checked with a bash script/command found on this forum).
  • Both nodes are AMD Ryzen 2600 with the same motherboard (I can't recall the model but can add it if necessary).
  • The BIOS/UEFI on both nodes was updated about 6 months ago, so it's relatively recent.
From what I've researched, there have been many issues with the driver for this USB NIC in the past, and I was wondering if there was some sort of regression in the kernel or something along those lines.

Here's all the data I've gathered so far.

Threads and solutions I've explored:

History - USB Auto-suspend disable:
Code:
echo -1 | sudo tee /sys/bus/usb/devices/*/power/autosuspend >/dev/null
echo on | sudo tee /sys/bus/usb/devices/*/power/level >/dev/null
nano /etc/default/grub
sed -i 's/GRUB_CMDLINE_LINUX_DEFAULT="/&usbcore.autosuspend=-1 /' /etc/default/grub
update-grub
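
A minimal sketch for checking whether those autosuspend settings actually stuck on the ASIX adapters. It assumes the usual sysfs layout, where each USB device directory has an `idVendor` file and `power/control` and `power/autosuspend` attributes (`power/control` being, as far as I know, the current name for what older kernels exposed as `power/level`). The function names and the `sysfs_root` parameter are made up for illustration; the vendor ID `0b95` comes from the lsusb output further down.

```python
from pathlib import Path

ASIX_VENDOR_ID = "0b95"  # vendor ID reported by lsusb for the AX88179


def usb_power_settings(sysfs_root="/sys/bus/usb/devices", vendor_id=ASIX_VENDOR_ID):
    """Return {device: {"control": ..., "autosuspend": ...}} for matching USB devices."""
    results = {}
    root = Path(sysfs_root)
    if not root.is_dir():
        return results
    for dev in sorted(root.iterdir()):
        vendor_file = dev / "idVendor"
        if not vendor_file.is_file():
            continue  # interface entries such as 4-4:1.0 carry no idVendor
        if vendor_file.read_text().strip().lower() != vendor_id:
            continue
        settings = {}
        for name in ("control", "autosuspend"):
            attr = dev / "power" / name
            settings[name] = attr.read_text().strip() if attr.is_file() else None
        results[dev.name] = settings
    return results


def autosuspend_disabled(settings):
    """True when runtime PM is forced on or the autosuspend delay is negative."""
    if settings.get("control") == "on":
        return True
    try:
        return int(settings.get("autosuspend")) < 0
    except (TypeError, ValueError):
        return False
```

Calling `usb_power_settings()` on the host and passing each entry to `autosuspend_disabled()` should show whether any adapter fell back to `auto` after a reset or re-plug.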

/etc/network/interfaces snippet (only the relevant parts)

Code:
auto enx7cc2c649a930
iface enx7cc2c649a930 inet manual
        post-up /sbin/ethtool -K enx7cc2c649a930 tx off rx off gso off tso off
#TPLINK USB BOTTOM

auto enx7cc2c64b9069
iface enx7cc2c64b9069 inet manual
        post-up /sbin/ethtool -K enx7cc2c64b9069 tx off rx off gso off tso off
#TPLINK USB TOP

auto vmbr11
iface vmbr11 inet manual
        bridge-ports enx7cc2c64b9069
        bridge-stp off
        bridge-fd 0
#WAN - TELECOM

auto vmbr12
iface vmbr12 inet manual
        bridge-ports enx7cc2c649a930
        bridge-stp off
        bridge-fd 0
        bridge-vlan-aware yes
        bridge-vids 2-4094
#WAN - MOVISTAR

lsusb

Bash:
root@pve02:~# lsusb -tv
/:  Bus 04.Port 1: Dev 1, Class=root_hub, Driver=xhci_hcd/4p, 5000M
    ID 1d6b:0003 Linux Foundation 3.0 root hub
    |__ Port 3: Dev 6, If 0, Class=Vendor Specific Class, Driver=ax88179_178a, 5000M
        ID 0b95:1790 ASIX Electronics Corp. AX88179 Gigabit Ethernet
    |__ Port 4: Dev 5, If 0, Class=Vendor Specific Class, Driver=ax88179_178a, 5000M
        ID 0b95:1790 ASIX Electronics Corp. AX88179 Gigabit Ethernet
/:  Bus 03.Port 1: Dev 1, Class=root_hub, Driver=xhci_hcd/4p, 480M
    ID 1d6b:0002 Linux Foundation 2.0 root hub
/:  Bus 02.Port 1: Dev 1, Class=root_hub, Driver=xhci_hcd/3p, 10000M
    ID 1d6b:0003 Linux Foundation 3.0 root hub
/:  Bus 01.Port 1: Dev 1, Class=root_hub, Driver=xhci_hcd/9p, 480M
    ID 1d6b:0002 Linux Foundation 2.0 root hub
    |__ Port 4: Dev 2, If 1, Class=Human Interface Device, Driver=usbhid, 1.5M
        ID 04f3:0103 Elan Microelectronics Corp. ActiveJet K-2024 Multimedia Keyboard
    |__ Port 4: Dev 2, If 0, Class=Human Interface Device, Driver=usbhid, 1.5M
        ID 04f3:0103 Elan Microelectronics Corp. ActiveJet K-2024 Multimedia Keyboard
    |__ Port 5: Dev 3, If 0, Class=Human Interface Device, Driver=usbhid, 1.5M
        ID 0458:003a KYE Systems Corp. (Mouse Systems) NetScroll+ Mini Traveler / Genius NetScroll 120

journalctl -e

Bash:
Mar 08 19:19:40 pve01 kernel: xhci_hcd 0000:08:00.3: WARN: HC couldn't access mem fast enough for slot 1 ep 2
Mar 08 19:21:22 pve01 pmxcfs[2204]: [dcdb] notice: data verification successful
Mar 08 19:21:23 pve01 smartd[1932]: Device: /dev/sda [SAT], SMART Usage Attribute: 194 Temperature_Celsius changed from 50 to 52
Mar 08 19:21:26 pve01 smartd[1932]: Device: /dev/sdc [SAT], SMART Prefailure Attribute: 1 Raw_Read_Error_Rate changed from 80 to 81
Mar 08 19:21:26 pve01 smartd[1932]: Device: /dev/sdc [SAT], SMART Usage Attribute: 195 Hardware_ECC_Recovered changed from 80 to 81
Mar 08 19:25:01 pve01 CRON[2011272]: pam_unix(cron:session): session opened for user root(uid=0) by (uid=0)
Mar 08 19:25:01 pve01 CRON[2011273]: (root) CMD (command -v debian-sa1 > /dev/null && debian-sa1 1 1)
Mar 08 19:25:01 pve01 CRON[2011272]: pam_unix(cron:session): session closed for user root
Mar 08 19:28:15 pve01 pmxcfs[2204]: [status] notice: received log
Mar 08 19:28:16 pve01 pmxcfs[2204]: [status] notice: received log
Mar 08 19:28:52 pve01 kernel: xhci_hcd 0000:08:00.3: WARN Set TR Deq Ptr cmd failed due to incorrect slot or ep state.
Mar 08 19:28:52 pve01 kernel: ax88179_178a 4-4:1.0 enx7cc2c64b9069: unregister 'ax88179_178a' usb-0000:08:00.3-4, ASIX AX88179 USB 3.0 Gigabit Ethernet
Mar 08 19:28:52 pve01 kernel: ax88179_178a 4-4:1.0 enx7cc2c64b9069: Failed to read reg index 0x0002: -19
Mar 08 19:28:52 pve01 kernel: ax88179_178a 4-4:1.0 enx7cc2c64b9069: Failed to write reg index 0x0002: -19
Mar 08 19:28:52 pve01 kernel: vmbr11: port 1(enx7cc2c64b9069) entered disabled state
Mar 08 19:28:52 pve01 kernel: device enx7cc2c64b9069 left promiscuous mode
Mar 08 19:28:52 pve01 kernel: vmbr11: port 1(enx7cc2c64b9069) entered disabled state
Mar 08 19:28:52 pve01 kernel: ax88179_178a 4-4:1.0 enx7cc2c64b9069 (unregistered): Failed to write reg index 0x0002: -19
Mar 08 19:28:52 pve01 kernel: ax88179_178a 4-4:1.0 enx7cc2c64b9069 (unregistered): Failed to write reg index 0x0001: -19
Mar 08 19:28:52 pve01 kernel: ax88179_178a 4-4:1.0 enx7cc2c64b9069 (unregistered): Failed to write reg index 0x0002: -19
Mar 08 19:28:53 pve01 kernel: usb 4-4: reset SuperSpeed USB device number 7 using xhci_hcd
Mar 08 19:28:53 pve01 kernel: ax88179_178a 4-4:1.0 eth0: register 'ax88179_178a' at usb-0000:08:00.3-4, ASIX AX88179 USB 3.0 Gigabit Ethernet, 7c:c2:c6:4b:90:69
Mar 08 19:28:53 pve01 kernel: ax88179_178a 4-4:1.0 enx7cc2c64b9069: renamed from eth0

As you can probably see in the logs, vmbr11 never goes into forwarding state after the USB NIC disconnects and reconnects.
The same issue occurs with the other device (which is on vmbr12).
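
To see how often this happens and on which adapter, the unregister/register pairs can be pulled out of the kernel log. This is a rough sketch that assumes the journal's default short format exactly as pasted above; the regular expression and the function name are illustrative, not part of any tool.

```python
import re

# Matches kernel lines such as:
#   ... kernel: ax88179_178a 4-4:1.0 enx7cc2c64b9069: unregister 'ax88179_178a' usb-...
#   ... kernel: ax88179_178a 4-4:1.0 eth0: register 'ax88179_178a' at usb-...
EVENT_RE = re.compile(
    r"^(?P<ts>\w{3} +\d+ [\d:]{8}) (?P<host>\S+) kernel: "
    r"ax88179_178a (?P<usb>\S+) (?P<iface>[^:\s]+): "
    r"(?P<event>unregister|register) 'ax88179_178a'"
)


def usb_nic_events(lines):
    """Return (timestamp, host, usb_path, interface, event) tuples, in log order."""
    events = []
    for line in lines:
        match = EVENT_RE.match(line)
        if match:
            events.append(
                (match["ts"], match["host"], match["usb"], match["iface"], match["event"])
            )
    return events
```

It could be fed with the output of `journalctl -k --no-pager` split into lines. Comparing the timestamps against the DHCP lease renewals would help confirm or rule out the lease-expiry assumption.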

pveversion -v

Code:
proxmox-ve: 8.1.0 (running kernel: 6.2.16-19-pve)
pve-manager: 8.0.4 (running version: 8.0.4/d258a813cfa6b390)
pve-kernel-6.2: 8.0.5
proxmox-kernel-helper: 8.0.3
pve-kernel-5.15: 7.4-4
proxmox-kernel-6.5.13-1-pve-signed: 6.5.13-1
proxmox-kernel-6.5: 6.5.13-1
pve-kernel-5.4: 6.4-7
proxmox-kernel-6.2.16-19-pve: 6.2.16-19
proxmox-kernel-6.2: 6.2.16-19
pve-kernel-6.2.16-5-pve: 6.2.16-6
pve-kernel-5.15.108-1-pve: 5.15.108-2
pve-kernel-5.13.19-2-pve: 5.13.19-4
pve-kernel-5.4.143-1-pve: 5.4.143-1
pve-kernel-5.4.34-1-pve: 5.4.34-2
ceph-fuse: 16.2.11+ds-2
corosync: 3.1.7-pve3
criu: 3.17.1-2
glusterfs-client: 10.3-5
ifupdown: residual config
ifupdown2: 3.2.0-1+pmx5
ksm-control-daemon: 1.4-1
libjs-extjs: 7.0.0-4
libknet1: 1.28-pve1
libproxmox-acme-perl: 1.4.6
libproxmox-backup-qemu0: 1.4.0
libproxmox-rs-perl: 0.3.1
libpve-access-control: 8.0.5
libpve-apiclient-perl: 3.3.1
libpve-common-perl: 8.0.9
libpve-guest-common-perl: 5.0.5
libpve-http-server-perl: 5.0.4
libpve-rs-perl: 0.8.5
libpve-storage-perl: 8.0.2
libqb0: 1.0.5-1
libspice-server1: 0.15.1-1
lvm2: 2.03.16-2
lxc-pve: 5.0.2-4
lxcfs: 5.0.3-pve3
novnc-pve: 1.4.0-2
proxmox-backup-client: 3.0.4-1
proxmox-backup-file-restore: 3.0.4-1
proxmox-kernel-helper: 8.0.3
proxmox-mail-forward: 0.2.0
proxmox-mini-journalreader: 1.4.0
proxmox-offline-mirror-helper: 0.6.2
proxmox-widget-toolkit: 4.0.9
pve-cluster: 8.0.4
pve-container: 5.0.5
pve-docs: 8.0.5
pve-edk2-firmware: 3.20230228-4
pve-firewall: 5.0.3
pve-firmware: 3.8-3
pve-ha-manager: 4.0.2
pve-i18n: 3.0.7
pve-qemu-kvm: 8.0.2-7
pve-xtermjs: 4.16.0-3
qemu-server: 8.0.7
smartmontools: 7.3-pve1
spiceterm: 3.3.0
swtpm: 0.8.0+pve1
vncterm: 1.8.0
zfsutils-linux: 2.1.13-pve1

In the meantime, if this keeps happening, I might resort to writing a small Python watchdog that runs ifreload -a whenever a NIC drops out...
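
A rough sketch of what such a watchdog could look like. It assumes that a port attached to a bridge shows up under `/sys/class/net/<bridge>/brif/<port>`, and that `ifreload -a` is enough to re-attach a re-registered NIC (not verified here). The bridge/port pairs come from the interfaces snippet above; the function names and the 30-second interval are arbitrary choices for illustration.

```python
import subprocess
import time
from pathlib import Path

# Bridge -> USB NIC port, taken from the /etc/network/interfaces snippet.
BRIDGE_PORTS = {
    "vmbr11": "enx7cc2c64b9069",
    "vmbr12": "enx7cc2c649a930",
}


def missing_ports(bridge_ports, sysfs_net="/sys/class/net"):
    """Return the (bridge, port) pairs where the port is not attached to the bridge."""
    missing = []
    for bridge, port in bridge_ports.items():
        # Attached ports are listed as entries under <bridge>/brif/.
        if not (Path(sysfs_net) / bridge / "brif" / port).exists():
            missing.append((bridge, port))
    return missing


def check_once(bridge_ports, sysfs_net="/sys/class/net", run=subprocess.run):
    """Run `ifreload -a` once if any expected port is detached; return what was missing."""
    missing = missing_ports(bridge_ports, sysfs_net)
    if missing:
        run(["ifreload", "-a"], check=False)
    return missing


def watch(bridge_ports, interval=30.0, sysfs_net="/sys/class/net"):
    """Poll forever, reloading the network config whenever a port has dropped out."""
    while True:
        for bridge, port in check_once(bridge_ports, sysfs_net):
            print(f"{port} was missing from {bridge}; ran ifreload -a", flush=True)
        time.sleep(interval)
```

Running `watch(BRIDGE_PORTS)` as root from a systemd service would be one way to keep it alive. Polling sysfs is used here instead of tailing the journal only because it is simpler to reason about; it does nothing about the underlying disconnects.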

If anyone has ideas or has been struck by this issue, I'd really appreciate some help!
Regards,
Dylan
 