"pve kernel: e1000e 0000:00:1f.6 eno1: Detected Hardware Unit Hang"

chudak

Well-Known Member
May 11, 2019
Hello all,

I get this error when trying to upload files via S3.
I must admit I am using 17+ upload threads to do so; with a lower number (7 threads) I don't see this.

I would consider it a "pilot error" problem, but I don't see those issues on much less powerful hardware with 30 threads running the same uploads.
So maybe it is configurable somewhere?
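For what it's worth, a workaround often reported for this exact e1000e "Detected Hardware Unit Hang" message is disabling segmentation offloads on the affected host interface with ethtool. A minimal sketch, assuming the interface name eno1 from the log and that the TSO/GSO offloads are the culprits:

```shell
# Show the current offload settings for the interface named in the log
ethtool -k eno1

# Disable TCP segmentation offload and generic segmentation offload
# (a commonly reported workaround for e1000e "Hardware Unit Hang")
ethtool -K eno1 tso off gso off
```

Note this change does not survive a reboot; if it helps, it can be made persistent, e.g. via a post-up hook on the interface in /etc/network/interfaces.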

Code:
Sep 13 11:56:13 pve kernel: e1000e 0000:00:1f.6 eno1: Detected Hardware Unit Hang:
                              TDH                  <4f>
                              TDT                  <ae>
                              next_to_use          <ae>
                              next_to_clean        <4e>
                            buffer_info[next_to_clean]:
                              time_stamp           <100875622>
                              next_to_watch        <4f>
                              jiffies              <100875831>
                              next_to_watch.status <0>
                            MAC Status             <40080083>
                            PHY Status             <796d>
                            PHY 1000BASE-T Status  <3800>
                            PHY Extended Status    <3000>
                            PCI Status             <10>

Thx!
 
I changed the VM's NIC in the hardware settings from VirtIO (paravirtualized) to Intel E1000, and that seemed to get rid of those errors.

Still, I'm interested in learning more about those errors and the right way to manage NIC selection for VMs.
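For reference, the VM's NIC model can also be switched from the Proxmox CLI with `qm`; a sketch, where VMID 100 and bridge vmbr0 are placeholders for this example:

```shell
# Switch VM 100's first network device to the paravirtualized VirtIO model
# (VMID 100 and bridge vmbr0 are example values, not from this thread)
qm set 100 --net0 virtio,bridge=vmbr0

# Or to the emulated Intel E1000 model
qm set 100 --net0 e1000,bridge=vmbr0
```

One caveat: omitting a `macaddr=...` option here lets Proxmox generate a new MAC address for the device, which may matter for DHCP reservations inside the guest.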
 
E1000 is an emulated NIC and typically less efficient (more CPU overhead, less performance). So if I am considering performance and efficiency I choose Virtio.
What you describe seems counterintuitive ;)
Did you notice any changes in throughput etc.?
 

No, I have not yet.
But I will measure timing etc. to see the difference.
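One quick way to compare the two NIC models is an iperf3 run between the VM and another host, using parallel streams to roughly mimic the multi-threaded upload. The tool and the target address are assumptions here, not from the thread:

```shell
# On the receiving end (e.g. another machine on the LAN):
iperf3 -s

# From the VM, with 7 parallel streams to mimic the upload threads
# (192.168.1.10 is a placeholder for the receiver's address)
iperf3 -c 192.168.1.10 -P 7 -t 30
```

Running the same test once with VirtIO and once with E1000 would show whether the model switch costs throughput.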

In other words, you're saying 7 threads on VirtIO may perform better than on E1000. Good point!
 
I'd expect that you will get better performance with VirtIO. I have no plausible explanation for what you see, except perhaps that the issue lies somewhere else (storage, for instance) and the higher thread count means higher load, at which point something drops out.
It might just be a symptom.
 
Thanks for sharing.
Still odd that you see errors with more threads.
But as I said, it may just be a symptom.
All the best
 
I actually see more of this even without uploading.

Code:
Sep 14 08:14:02 pve kernel: e1000e 0000:00:1f.6 eno1: Detected Hardware Unit Hang:
                              TDH                  <b4>
                              TDT                  <d5>
                              next_to_use          <d5>
                              next_to_clean        <b3>
                            buffer_info[next_to_clean]:
                              time_stamp           <100547db2>
                              next_to_watch        <b4>
                              jiffies              <100547f98>
                              next_to_watch.status <0>
                            MAC Status             <40080083>
                            PHY Status             <796d>
                            PHY 1000BASE-T Status  <3800>
                            PHY Extended Status    <3000>
                            PCI Status             <10>
Sep 14 08:14:03 pve pvestatd[1236]: storage 'Backups' is not online
Sep 14 08:14:04 pve kernel: e1000e 0000:00:1f.6 eno1: Detected Hardware Unit Hang:
                              TDH                  <b4>
                              TDT                  <d5>
                              next_to_use          <d5>
                              next_to_clean        <b3>
                            buffer_info[next_to_clean]:
                              time_stamp           <100547db2>
                              next_to_watch        <b4>
                              jiffies              <100548188>
                              next_to_watch.status <0>
                            MAC Status             <40080083>
                            PHY Status             <796d>
                            PHY 1000BASE-T Status  <3800>
                            PHY Extended Status    <3000>
                            PCI Status             <10>
Sep 14 08:14:06 pve pvestatd[1236]: storage 'ISOs-SMB' is not online
Sep 14 08:14:06 pve kernel: e1000e 0000:00:1f.6 eno1: Detected Hardware Unit Hang:
                              TDH                  <b4>
                              TDT                  <d5>
                              next_to_use          <d5>
                              next_to_clean        <b3>
                            buffer_info[next_to_clean]:
                              time_stamp           <100547db2>
                              next_to_watch        <b4>
                              jiffies              <100548380>
                              next_to_watch.status <0>
                            MAC Status             <40080083>
                            PHY Status             <796d>
                            PHY 1000BASE-T Status  <3800>
                            PHY Extended Status    <3000>
                            PCI Status             <10>
Sep 14 08:14:07 pve kernel: ------------[ cut here ]------------
Sep 14 08:14:07 pve kernel: NETDEV WATCHDOG: eno1 (e1000e): transmit queue 0 timed out
Sep 14 08:14:07 pve kernel: WARNING: CPU: 7 PID: 0 at net/sched/sch_generic.c:448 dev_watchdog+0x264/0x270
Sep 14 08:14:07 pve kernel: Modules linked in: tcp_diag(E) inet_diag(E) veth(E) md4(E) cmac(E) nls_utf8(E) cifs(E) fscache(E) libdes(E) ebtable_filter(E) ebtables(E) ip_set(E) ip6table_raw(E) iptable_raw(E) ip6table_filter(E) ip6_tables(E) iptable_filter(E) bpfilter(E) softdog(E) nfnetlink_log(E) nfnetlink(E) zfs(POE) zunicode(POE) zlua(POE) zavl(POE) icp(POE) zcommon(POE) znvpair(POE) spl(OE) snd_hda_codec_hdmi(E) intel_rapl_msr(E) intel_rapl_common(E) x86_pkg_temp_thermal(E) intel_powerclamp(E) coretemp(E) snd_hda_codec_realtek(E) snd_hda_codec_generic(E) kvm_intel(E) kvm(E) snd_sof_pci(E) snd_sof_intel_hda_common(E) snd_soc_hdac_hda(E) snd_sof_intel_hda(E) snd_sof_intel_byt(E) snd_sof_intel_ipc(E) snd_sof(E) snd_sof_xtensa_dsp(E) snd_hda_ext_core(E) snd_soc_acpi_intel_match(E) snd_soc_acpi(E) ledtrig_audio(E) snd_soc_core(E) snd_compress(E) ac97_bus(E) snd_pcm_dmaengine(E) snd_hda_intel(E) iwlmvm(E) crct10dif_pclmul(E) i915(E) snd_intel_dspcfg(E) crc32_pclmul(E) snd_hda_codec(E) ghash_clmulni_intel(E)
Sep 14 08:14:07 pve kernel:  mac80211(E) btusb(E) drm_kms_helper(E) snd_hda_core(E) libarc4(E) btrtl(E) snd_hwdep(E) btbcm(E) btintel(E) drm(E) snd_pcm(E) aesni_intel(E) iwlwifi(E) bluetooth(E) crypto_simd(E) i2c_algo_bit(E) mei_hdcp(E) snd_timer(E) cryptd(E) fb_sys_fops(E) tps6598x(E) ecdh_generic(E) typec(E) snd(E) syscopyarea(E) glue_helper(E) ecc(E) sysfillrect(E) joydev(E) intel_cstate(E) mei_me(E) soundcore(E) sysimgblt(E) input_leds(E) pcspkr(E) mei(E) wmi_bmof(E) cfg80211(E) intel_wmi_thunderbolt(E) i2c_multi_instantiate(E) mac_hid(E) acpi_pad(E) acpi_tad(E) vhost_net(E) vhost(E) tap(E) ib_iser(E) rdma_cm(E) iw_cm(E) ib_cm(E) ib_core(E) sunrpc(E) iscsi_tcp(E) libiscsi_tcp(E) libiscsi(E) scsi_transport_iscsi(E) ip_tables(E) x_tables(E) autofs4(E) btrfs(E) xor(E) zstd_compress(E) raid6_pq(E) dm_thin_pool(E) dm_persistent_data(E) dm_bio_prison(E) dm_bufio(E) libcrc32c(E) hid_generic(E) usbkbd(E) usbmouse(E) usbhid(E) hid(E) sdhci_pci(E) e1000e(E) cqhci(E) i2c_i801(E) sdhci(E) thunderbolt(E) ahci(E)
Sep 14 08:14:07 pve kernel:  intel_lpss_pci(E) intel_lpss(E) libahci(E) idma64(E) virt_dma(E) wmi(E) pinctrl_cannonlake(E) video(E) pinctrl_intel(E)
Sep 14 08:14:07 pve kernel: CPU: 7 PID: 0 Comm: swapper/7 Tainted: P           OE     5.4.34-1-pve #1
Sep 14 08:14:07 pve kernel: Hardware name: Intel(R) Client Systems NUC10i7FNH/NUC10i7FNB, BIOS FNCML357.0044.2020.0715.1813 07/15/2020
Sep 14 08:14:07 pve kernel: RIP: 0010:dev_watchdog+0x264/0x270
Sep 14 08:14:07 pve kernel: Code: 48 85 c0 75 e6 eb a0 4c 89 ef c6 05 d5 0c e7 00 01 e8 b0 b1 fa ff 89 d9 4c 89 ee 48 c7 c7 40 89 a4 ac 48 89 c2 e8 9d 2b 70 ff <0f> 0b eb 82 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 55 48 89 e5 41
Sep 14 08:14:07 pve kernel: RSP: 0018:ffffa66e002c0e58 EFLAGS: 00010282
Sep 14 08:14:07 pve kernel: RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000006
Sep 14 08:14:07 pve kernel: RDX: 0000000000000007 RSI: 0000000000000082 RDI: ffff942a40bd78c0
Sep 14 08:14:07 pve kernel: RBP: ffffa66e002c0e88 R08: 00000000000004c3 R09: ffffffffad1b0c1c
Sep 14 08:14:07 pve kernel: R10: 0000000000000774 R11: ffffa66e002c0cb0 R12: 0000000000000001
Sep 14 08:14:07 pve kernel: R13: ffff942a2d5bc000 R14: ffff942a2d5bc480 R15: ffff942a2f99b280
Sep 14 08:14:07 pve kernel: FS:  0000000000000000(0000) GS:ffff942a40bc0000(0000) knlGS:0000000000000000
Sep 14 08:14:07 pve kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Sep 14 08:14:07 pve kernel: CR2: 00000000011a03c0 CR3: 000000101e994005 CR4: 00000000003626e0
Sep 14 08:14:07 pve kernel: Call Trace:
Sep 14 08:14:07 pve kernel:  <IRQ>
Sep 14 08:14:07 pve kernel:  ? pfifo_fast_enqueue+0x160/0x160
Sep 14 08:14:07 pve kernel:  call_timer_fn+0x32/0x130
Sep 14 08:14:07 pve kernel:  run_timer_softirq+0x1a5/0x430
Sep 14 08:14:07 pve kernel:  ? enqueue_hrtimer+0x3c/0x90
Sep 14 08:14:07 pve kernel:  ? ktime_get+0x3c/0xa0
Sep 14 08:14:07 pve kernel:  ? lapic_next_deadline+0x26/0x30
Sep 14 08:14:07 pve kernel:  ? clockevents_program_event+0x93/0xf0
Sep 14 08:14:07 pve kernel:  __do_softirq+0xdc/0x2d4
Sep 14 08:14:07 pve kernel:  irq_exit+0xa9/0xb0
Sep 14 08:14:07 pve kernel:  smp_apic_timer_interrupt+0x79/0x130
Sep 14 08:14:07 pve kernel:  apic_timer_interrupt+0xf/0x20
Sep 14 08:14:07 pve kernel:  </IRQ>
Sep 14 08:14:07 pve kernel: RIP: 0010:cpuidle_enter_state+0xbd/0x450
Sep 14 08:14:07 pve kernel: Code: ff e8 e7 63 80 ff 80 7d c7 00 74 17 9c 58 0f 1f 44 00 00 f6 c4 02 0f 85 63 03 00 00 31 ff e8 5a df 86 ff fb 66 0f 1f 44 00 00 <45> 85 ed 0f 88 8d 02 00 00 49 63 cd 48 8b 75 d0 48 2b 75 c8 48 8d
Sep 14 08:14:07 pve kernel: RSP: 0018:ffffa66e0012fe48 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff13
Sep 14 08:14:07 pve kernel: RAX: ffff942a40bead40 RBX: ffffffffacd66820 RCX: 000000000000001f
Sep 14 08:14:07 pve kernel: RDX: 0000146c7090b4db RSI: 000000004f9a1c05 RDI: 0000000000000000
Sep 14 08:14:07 pve kernel: RBP: ffffa66e0012fe88 R08: 0000000000000002 R09: 000000000002a5c0
Sep 14 08:14:07 pve kernel: R10: 000020dcefd98f38 R11: ffff942a40be99e0 R12: ffff942a2fb16400
Sep 14 08:14:07 pve kernel: R13: 0000000000000002 R14: ffffffffacd668f8 R15: ffffffffacd668e0
Sep 14 08:14:07 pve kernel:  ? cpuidle_enter_state+0x99/0x450
Sep 14 08:14:07 pve kernel:  cpuidle_enter+0x2e/0x40
Sep 14 08:14:07 pve kernel:  call_cpuidle+0x23/0x40
Sep 14 08:14:07 pve kernel:  do_idle+0x22c/0x270
Sep 14 08:14:07 pve kernel:  cpu_startup_entry+0x1d/0x20
Sep 14 08:14:07 pve kernel:  start_secondary+0x166/0x1c0
Sep 14 08:14:07 pve kernel:  secondary_startup_64+0xa4/0xb0
Sep 14 08:14:07 pve kernel: ---[ end trace 3b0733ccf859f524 ]---

Looks like something is not happy, but what?
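Worth noting: eno1 with the e1000e driver is the host's physical onboard Intel NIC, not a VM interface, so the guest NIC model is at most indirectly related to this error. To confirm which device and driver version the log is complaining about, something like:

```shell
# Identify the driver and firmware behind the interface named in the log
ethtool -i eno1

# Show the PCI device at the address from the log (00:1f.6) and its kernel driver
lspci -nnk -s 00:1f.6

# Recent e1000e messages from the kernel ring buffer
dmesg | grep -i e1000e
```

The "storage ... is not online" lines from pvestatd fit the picture: while the host NIC's transmit queue is hung, network storage briefly drops out.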
 
