Network Device: Detected Tx Unit Hang - what could be the underlying issue?

Iacov

Member
Jan 24, 2024
hey

i've encountered a weird problem with my network device today

i have one PVE server which usually works really reliably. the only exception was an issue last saturday, which i put down to heat. today i suddenly lost connection to my VMs

a reboot did help, but the log showed the following issue:
Code:
Jul 01 05:37:48 pve1 kernel: igc 0000:01:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x000b address=0xcf79b2c0 flags=0x0000]
Jul 01 05:37:50 pve1 kernel: igc 0000:01:00.0 eno1: Detected Tx Unit Hang
  Tx Queue             <1>
  TDH                  <37>
  TDT                  <51>
  next_to_use          <51>
  next_to_clean        <37>
buffer_info[next_to_clean]
  time_stamp           <10f1c6564>
  next_to_watch        <000000008856c63f>
  jiffies              <10f1c6b40>
  desc.status          <14e8200>
Jul 01 05:37:50 pve1 kernel: igc 0000:01:00.0 eno1: Detected Tx Unit Hang
  Tx Queue             <3>
  TDH                  <8b>
  TDT                  <bc>
  next_to_use          <bc>
  next_to_clean        <8b>
buffer_info[next_to_clean]
  time_stamp           <10f1c6559>
  next_to_watch        <00000000f73cbd08>
  jiffies              <10f1c6b40>
  desc.status          <d8000>
(and there's a repeating sequence of entries, only varying in values for Tx Queue etc)
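For reading the dump above, a small sketch of the arithmetic may help. It assumes the values in angle brackets are hexadecimal (entries like `8b` and `bc` suggest so), that TDH/TDT are the head and tail indices of the transmit ring, and that `time_stamp` is the jiffies value at which the stuck buffer was queued. The ring size of 256 and the tick rate of 250 Hz are assumptions, not values taken from this system; the function names are made up for illustration.

```shell
#!/usr/bin/env bash
# Helpers for reading a "Detected Tx Unit Hang" dump.
# All inputs are hex strings without the 0x prefix, as printed in the log.

# Descriptors still waiting between head (TDH) and tail (TDT).
# ring_size defaults to 256, which is an assumption; `ethtool -g eno1`
# shows the real value.
pending_descriptors() {
    local tdh="$1" tdt="$2" ring="${3:-256}"
    echo $(( (0x$tdt - 0x$tdh + ring) % ring ))
}

# Jiffies elapsed between queueing the buffer and the hang report.
elapsed_jiffies() {
    local time_stamp="$1" jiffies="$2"
    echo $(( 0x$jiffies - 0x$time_stamp ))
}

# Convert jiffies to milliseconds. hz defaults to 250 (assumption);
# check with: grep 'CONFIG_HZ=' /boot/config-$(uname -r)
jiffies_to_ms() {
    local ticks="$1" hz="${2:-250}"
    echo $(( ticks * 1000 / hz ))
}

pending_descriptors 37 51              # queue 1 -> 26
pending_descriptors 8b bc              # queue 3 -> 49
elapsed_jiffies 10f1c6564 10f1c6b40    # queue 1 -> 1500
elapsed_jiffies 10f1c6559 10f1c6b40    # queue 3 -> 1511
jiffies_to_ms 1500                     # -> 6000 (only if HZ really is 250)
```

If those assumptions hold, both queues had packets sitting unsent for roughly six seconds, which would fit a device that stopped transmitting rather than a single dropped frame.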

eno1 is my network device

what could be the issue with my network configuration or hardware?
just a "freak" one-time driver issue or could it hint at a larger issue that i should investigate further?

thanks for your advice!

edit: hardware is a minisforum um 560 with a ryzen 5 5625u...can't find the network card specs at the moment, will edit as soon as i find them.
the device has been running proxmox since 2023 and this is the first time this issue has occurred.
pve version: 8.4.1
 
can't find the network card specs
Use lspci -vnnk | awk '/Ethernet/{print $0}' RS= as well as ethtool eno1 and ethtool -i eno1 to get some information about it.
You might have to install ethtool via apt install ethtool first.
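The commands above can be wrapped into one small helper so the output can be pasted in a single go. This is only a sketch: the function name `nic_info` is made up, and `eno1` is used as the default because that is the interface named in this thread.

```shell
#!/usr/bin/env bash
# Collect basic NIC details for one interface (default: eno1).
nic_info() {
    local iface="${1:-eno1}"

    echo "== PCI Ethernet devices =="
    if command -v lspci >/dev/null 2>&1; then
        # RS= puts awk in paragraph mode, so each device block is one record
        # and the whole block is printed when it mentions "Ethernet".
        lspci -vnnk | awk '/Ethernet/{print $0}' RS= || true
    else
        echo "lspci not found (apt install pciutils)"
    fi

    echo "== link settings and driver for $iface =="
    if command -v ethtool >/dev/null 2>&1; then
        ethtool "$iface" || true      # speed, duplex, link state
        ethtool -i "$iface" || true   # driver name, driver and firmware version
    else
        echo "ethtool not found (apt install ethtool)"
    fi
}

# Usage: nic_info eno1
```

Running it as root gives the most complete lspci output, since some capability details are hidden from unprivileged users.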
 
thank you

Code:
01:00.0 Ethernet controller [0200]: Intel Corporation Ethernet Controller I225-V [8086:15f3] (rev 03)
        DeviceName: Onboard LAN Brodcom
        Subsystem: Intel Corporation Ethernet Controller I225-V [8086:0000]
        Flags: bus master, fast devsel, latency 0, IRQ 24, IOMMU group 10
        Memory at fcc00000 (32-bit, non-prefetchable) [size=1M]
        Memory at fcd00000 (32-bit, non-prefetchable) [size=16K]
        Capabilities: [40] Power Management version 3
        Capabilities: [50] MSI: Enable- Count=1/1 Maskable+ 64bit+
        Capabilities: [70] MSI-X: Enable+ Count=5 Masked-
        Capabilities: [a0] Express Endpoint, MSI 00
        Capabilities: [100] Advanced Error Reporting
        Capabilities: [140] Device Serial Number redacted
        Capabilities: [1c0] Latency Tolerance Reporting
        Capabilities: [1f0] Precision Time Measurement
        Capabilities: [1e0] L1 PM Substates
        Kernel driver in use: igc
        Kernel modules: igc
does this controller have known issues that could explain the hang?

are there any logs that I could dig through that could offer more insight into what happened than the syslog?
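One place to look, as a rough sketch: the kernel messages of the boot before the reboot, filtered down to the driver and IOMMU lines. This assumes the systemd journal is persistent on the host, so that `journalctl -k -b -1` can still reach the previous boot. The helper names are made up for illustration.

```shell
#!/usr/bin/env bash
# Filters for kernel log text read from stdin.

# Keep only lines that mention the igc driver, AMD IOMMU events,
# or a Tx hang report.
filter_nic_events() {
    grep -E 'igc |AMD-Vi|Tx Unit Hang'
}

# Count how many "Detected Tx Unit Hang" reports appear in the input.
# grep -c exits non-zero when the count is 0, hence the || true.
count_tx_hangs() {
    grep -c 'Detected Tx Unit Hang' || true
}

# Usage (previous boot, kernel messages only):
#   journalctl -k -b -1 --no-pager | filter_nic_events
#   journalctl -k -b -1 --no-pager | count_tx_hangs
# Per-queue NIC counters from the driver can be read with:
#   ethtool -S eno1
```

Comparing the time of the first IO_PAGE_FAULT line with the first hang report would show whether the IOMMU fault always comes first, as it does in the excerpt posted above.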