Proxmox on NUC8i5BEH Kernel error

d1nd141

Feb 3, 2020
Hello,
I have already seen some posts here about the NIC and the "ethtool -K eno1 tso off gso off" workaround.
Unfortunately, I am getting another error that seems to involve the CPU:


Code:
[Mon Feb  3 00:59:05 2020] e1000e 0000:00:1f.6 eno1: Detected Hardware Unit Hang:
                             TDH                  <18>
                             TDT                  <53>
                             next_to_use          <53>
                             next_to_clean        <17>
                           buffer_info[next_to_clean]:
                             time_stamp           <10074c090>
                             next_to_watch        <18>
                             jiffies              <10074c750>
                             next_to_watch.status <0>
                           MAC Status             <40080083>
                           PHY Status             <796d>
                           PHY 1000BASE-T Status  <3800>
                           PHY Extended Status    <3000>
                           PCI Status             <10>
[Mon Feb  3 00:59:06 2020] ------------[ cut here ]------------
[Mon Feb  3 00:59:06 2020] NETDEV WATCHDOG: eno1 (e1000e): transmit queue 0 timed out
[Mon Feb  3 00:59:06 2020] WARNING: CPU: 4 PID: 0 at net/sched/sch_generic.c:448 dev_watchdog+0x264/0x270
[Mon Feb  3 00:59:06 2020] Modules linked in: veth ebtable_filter ebtables ip_set ip6table_raw iptable_raw ip6table_filter ip6_tables iptable_filter bpfilter softdog nfnetlink_log nfnetlink intel_rapl_msr intel_rapl_common x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel zfs(PO) aesni_intel zunicode(PO) aes_x86_64 zlua(PO) crypto_simd zavl(PO) cryptd glue_helper icp(PO) intel_cstate mei_hdcp i915 drm_kms_helper intel_rapl_perf drm rtsx_pci_ms mei_me i2c_algo_bit fb_sys_fops memstick syscopyarea sysfillrect sysimgblt wmi_bmof mei pcspkr intel_wmi_thunderbolt intel_pch_thermal mac_hid acpi_pad acpi_tad zcommon(PO) znvpair(PO) spl(O) vhost_net vhost tap ib_iser rdma_cm iw_cm ib_cm ib_core iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi sunrpc ip_tables x_tables autofs4 btrfs xor zstd_compress raid6_pq dm_thin_pool dm_persistent_data dm_bio_prison dm_bufio libcrc32c rtsx_pci_sdmmc e1000e i2c_i801 ahci rtsx_pci libahci wmi
[Mon Feb  3 00:59:06 2020]  pinctrl_cannonlake video pinctrl_intel
[Mon Feb  3 00:59:06 2020] CPU: 4 PID: 0 Comm: swapper/4 Tainted: P           O      5.3.13-1-pve #1
[Mon Feb  3 00:59:06 2020] Hardware name: Intel(R) Client Systems NUC8i5BEH/NUC8BEB, BIOS BECFL357.86A.0073.2019.0618.1409 06/18/2019
[Mon Feb  3 00:59:06 2020] RIP: 0010:dev_watchdog+0x264/0x270
[Mon Feb  3 00:59:06 2020] Code: 48 85 c0 75 e6 eb a0 4c 89 ef c6 05 f2 6f ea 00 01 e8 10 ed fa ff 89 d9 4c 89 ee 48 c7 c7 88 cc 61 a9 48 89 c2 e8 0d 3a 73 ff <0f> 0b eb 82 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 55 48 89 e5 41
[Mon Feb  3 00:59:06 2020] RSP: 0018:ffff9d63001e4e58 EFLAGS: 00010282
[Mon Feb  3 00:59:06 2020] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000006
[Mon Feb  3 00:59:06 2020] RDX: 0000000000000007 RSI: 0000000000000082 RDI: ffff8f1c6eb17440
[Mon Feb  3 00:59:06 2020] RBP: ffff9d63001e4e88 R08: 0000000000000368 R09: 0000000000000004
[Mon Feb  3 00:59:06 2020] R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000000001
[Mon Feb  3 00:59:06 2020] R13: ffff8f1c61e04000 R14: ffff8f1c61e04480 R15: ffff8f1c6bc9f480
[Mon Feb  3 00:59:06 2020] FS:  0000000000000000(0000) GS:ffff8f1c6eb00000(0000) knlGS:0000000000000000
[Mon Feb  3 00:59:06 2020] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[Mon Feb  3 00:59:06 2020] CR2: 0000122a17cb4000 CR3: 0000000463b04001 CR4: 00000000003626e0
[Mon Feb  3 00:59:06 2020] Call Trace:
[Mon Feb  3 00:59:06 2020]  <IRQ>
[Mon Feb  3 00:59:06 2020]  ? pfifo_fast_enqueue+0x160/0x160
[Mon Feb  3 00:59:06 2020]  call_timer_fn+0x32/0x130
[Mon Feb  3 00:59:06 2020]  run_timer_softirq+0x19d/0x420
[Mon Feb  3 00:59:06 2020]  ? enqueue_hrtimer+0x3c/0x90
[Mon Feb  3 00:59:06 2020]  ? ktime_get+0x40/0xa0
[Mon Feb  3 00:59:06 2020]  ? lapic_next_deadline+0x26/0x30
[Mon Feb  3 00:59:06 2020]  ? clockevents_program_event+0x93/0xf0
[Mon Feb  3 00:59:06 2020]  __do_softirq+0xdc/0x2d4
[Mon Feb  3 00:59:06 2020]  irq_exit+0xa9/0xb0
[Mon Feb  3 00:59:06 2020]  smp_apic_timer_interrupt+0x79/0x130
[Mon Feb  3 00:59:06 2020]  apic_timer_interrupt+0xf/0x20
[Mon Feb  3 00:59:06 2020]  </IRQ>
[Mon Feb  3 00:59:06 2020] RIP: 0010:cpuidle_enter_state+0xbd/0x450
[Mon Feb  3 00:59:06 2020] Code: ff e8 87 ea 82 ff 80 7d c7 00 74 17 9c 58 0f 1f 44 00 00 f6 c4 02 0f 85 63 03 00 00 31 ff e8 6a 51 89 ff fb 66 0f 1f 44 00 00 <45> 85 ed 0f 89 cf 01 00 00 41 c7 44 24 10 00 00 00 00 48 83 c4 18
[Mon Feb  3 00:59:06 2020] RSP: 0018:ffff9d63000f7e48 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff13
[Mon Feb  3 00:59:06 2020] RAX: ffff8f1c6eb2a740 RBX: ffffffffa9958900 RCX: 000000000000001f
[Mon Feb  3 00:59:06 2020] RDX: 00001c1d954ff265 RSI: 00000000378e3a78 RDI: 0000000000000000
[Mon Feb  3 00:59:06 2020] RBP: ffff9d63000f7e88 R08: 0000000000000002 R09: 0000000000029fc0
[Mon Feb  3 00:59:06 2020] R10: 000040cf19103c98 R11: ffff8f1c6eb294c4 R12: ffff8f1c6eb35500
[Mon Feb  3 00:59:06 2020] R13: 0000000000000001 R14: ffffffffa9958978 R15: ffffffffa9958960
[Mon Feb  3 00:59:06 2020]  ? cpuidle_enter_state+0x99/0x450
[Mon Feb  3 00:59:06 2020]  cpuidle_enter+0x2e/0x40
[Mon Feb  3 00:59:06 2020]  call_cpuidle+0x23/0x40
[Mon Feb  3 00:59:06 2020]  do_idle+0x22c/0x270
[Mon Feb  3 00:59:06 2020]  cpu_startup_entry+0x1d/0x20
[Mon Feb  3 00:59:06 2020]  start_secondary+0x167/0x1c0
[Mon Feb  3 00:59:06 2020]  secondary_startup_64+0xa4/0xb0
[Mon Feb  3 00:59:06 2020] ---[ end trace b04ccbce9f9e1380 ]---
[Mon Feb  3 00:59:06 2020] e1000e 0000:00:1f.6 eno1: Reset adapter unexpectedly
[Mon Feb  3 00:59:06 2020] vmbr0: port 1(eno1) entered disabled state
[Mon Feb  3 00:59:13 2020] e1000e: eno1 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None
[Mon Feb  3 00:59:13 2020] vmbr0: port 1(eno1) entered blocking state
[Mon Feb  3 00:59:13 2020] vmbr0: port 1(eno1) entered forwarding state
[Mon Feb  3 03:45:53 2020] perf: interrupt took too long (4958 > 4956), lowering kernel.perf_event_max_sample_rate to 40250


This may be related to the NIC error, or it may be a separate bug.
CPU usage has been about 2-4% across all 8 CPUs over the last day.
 
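Hi,

from the output, this is most likely related to the e1000e error: the "NETDEV WATCHDOG" warning and its call trace appear right after the driver reports "Detected Hardware Unit Hang" on eno1, and CPU 4 is only the core that happened to run the watchdog timer.

Did you try the workarounds posted in the forum?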
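
One way to check whether a trace like the one above comes from the NIC is to count the hang messages in the kernel log and look at the current offload settings. This is a minimal sketch, assuming the interface is eno1 as in the log; the helper name count_hangs is made up for illustration.

```shell
#!/bin/sh
# count_hangs (hypothetical helper): reads kernel log text on stdin and
# prints how many e1000e "Hardware Unit Hang" events it contains.
# grep -c exits non-zero when nothing matches, so "|| true" keeps the
# helper's exit status at 0 while still printing the count.
count_hangs() {
    grep -c 'Detected Hardware Unit Hang' || true
}

# Usage on the host (dmesg may need root):
#   dmesg -T | count_hangs
#
# Show which offloads are currently enabled on the NIC
# (lowercase -k only reads the settings, it does not change them):
#   ethtool -k eno1 | grep -E 'segmentation-offload|receive-offload|checksumming'
```

If the count keeps growing while the offloads are still reported as "on", that points to the NIC workaround rather than to a CPU problem.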
 
It was resolved after applying this command:
ethtool -K <device name> gso off gro off tso off tx off rx off

Thanks, sir.
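
Note that settings made with ethtool -K are lost on reboot. The sketch below prints the command from this thread for a given interface so it can be reviewed before running, and the comments show one common way to reapply it at boot. The interface name eno1 is taken from the log in the first post, and the helper name offload_cmd and the /sbin/ethtool path are assumptions, so adjust them to your system.

```shell
#!/bin/sh
# Interface name; eno1 comes from the log earlier in this thread.
IFACE="${IFACE:-eno1}"

# offload_cmd (hypothetical helper): prints the ethtool command from this
# thread for the given interface. It only prints, it changes nothing.
offload_cmd() {
    printf 'ethtool -K %s gso off gro off tso off tx off rx off\n' "$1"
}

offload_cmd "$IFACE"

# Run the printed line as root to apply it to the running system.
#
# To reapply it at boot, one common approach (an assumption, not taken from
# this thread) is a post-up line in the stanza for the NIC in
# /etc/network/interfaces, for example:
#
#   iface eno1 inet manual
#       post-up /sbin/ethtool -K eno1 gso off gro off tso off tx off rx off
```

Disabling these offloads moves segmentation and checksum work from the NIC to the CPU, so it may be worth trying only "tso off gso off" first and adding the other options if the hangs continue.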
 
