No Connection if run with Kernel version >= 5.15.35

icodetea

New Member
Jul 3, 2022
Since the kernel update to 5.15.XX, my Proxmox host drops all connections after a short while:
no container, VM, or Proxmox itself is reachable.
If I log in on the machine and try to fix it, even ping 8.8.8.8 does not work.
I already looked into /etc/network/interfaces:
Bash:
root@pve:~# cat /etc/network/interfaces
auto lo
iface lo inet loopback

iface enp1s0 inet manual

auto vmbr0
iface vmbr0 inet static
        address 192.168.1.2/24
        gateway 192.168.1.1
        bridge-ports enp1s0
        bridge-stp off
        bridge-fd 0
So nothing suspicious here; the name of my Ethernet interface is "enp1s0".

I also checked whether it had been renamed. Result: no, it was not:
Bash:
root@pve:~# lshw | grep -B 5 enp
                description: Ethernet interface
                product: RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller
                vendor: Realtek Semiconductor Co., Ltd.
                physical id: 0
                bus info: pci@0000:01:00.0
                logical name: enp1s0
The name matches.
I attached the syslog and the output of "ip address", "ip link" and "ip route".
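
Besides the interface name, it can help to confirm which kernel driver is actually bound to the NIC. The following is only a minimal sketch, assuming the usual sysfs layout where /sys/class/net/<interface>/device/driver is a symlink to the driver directory. The function name `nic_driver` and the optional second argument (an alternative sysfs root) are made up for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the kernel driver bound to a network interface.
# $1 = interface name
# $2 = sysfs root (assumed default: /sys; can be overridden, e.g. for a dry run)
nic_driver() {
    local iface="$1" sysroot="${2:-/sys}"
    local link="$sysroot/class/net/$iface/device/driver"
    # The "driver" entry is a symlink; its last path component is the driver name.
    if [ ! -L "$link" ]; then
        echo "no driver link for $iface" >&2
        return 1
    fi
    basename "$(readlink "$link")"
}
```

Called as `nic_driver enp1s0`, it would print the driver name for that interface. Virtual interfaces such as vmbr0 have no device link, so the helper returns an error for them.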

The crazy thing is: if I reboot, select the advanced boot options, and boot with the older kernel version (5.13.19-15), the connection works fine again.

So I cannot update my Proxmox kernel anymore... Any idea what's going on?
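
To avoid picking the old kernel by hand on every reboot, one option is to look up the newest installed kernel of the working series first. This is a rough sketch, assuming the installed kernels show up as vmlinuz-<version> files in /boot and that GNU `sort -V` is available. The function name `newest_kernel` is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the newest installed kernel whose version starts
# with a given prefix, judged by the vmlinuz-* files in a boot directory.
# $1 = version prefix (for example 5.13)
# $2 = boot directory (assumed default: /boot)
newest_kernel() {
    local prefix="$1" bootdir="${2:-/boot}"
    local f
    for f in "$bootdir"/vmlinuz-"$prefix"*; do
        [ -e "$f" ] || continue      # the glob matched nothing
        basename "$f" | sed 's/^vmlinuz-//'
    done | sort -V | tail -n 1       # version sort, keep the highest
}
```

The printed version string could then be handed to `proxmox-boot-tool kernel pin <version>` to keep that kernel as the boot default. Whether the installed proxmox-boot-tool already has the `kernel pin` subcommand is an assumption here; `proxmox-boot-tool kernel list` should show the names it accepts.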
 

Attachments

  • out_ipAddress.txt
    3.9 KB · Views: 3
  • out_ipLink.txt
    3.7 KB · Views: 1
  • out_iproute.txt
    143 bytes · Views: 1
  • syslog.txt
    174.7 KB · Views: 10
OK, thanks, done. But how can I check whether there is a fix for it? Just test upcoming versions?
 
Today I tried the most recent kernel version, 5.15.74: still the same problem. I am now 4 months behind with my kernel version. Can someone find out how I can revive my NIC controller so it works with the most recent updates? Or at least how to track whether there is a fix for it?
 

Attachments

  • syslog.txt
    218.8 KB · Views: 0
OK, I found the interesting part in the log.

Here is a similar problem: https://forum.proxmox.com/threads/n...88179_178a-transmit-queue-0-timed-out.117729/
I also tried the workaround from here (ethtool -K <interface> tso off gso off):
https://forum.proxmox.com/threads/e1000-driver-hang.58284/#post-279338
but no luck.
I would be happy for any hints...


Code:
Nov 23 17:52:38 pve kernel: ------------[ cut here ]------------
Nov 23 17:52:38 pve kernel: NETDEV WATCHDOG: enp1s0 (r8169): transmit queue 0 timed out
Nov 23 17:52:38 pve kernel: WARNING: CPU: 2 PID: 0 at net/sched/sch_generic.c:477 dev_watchdog+0x277/0x280
Nov 23 17:52:38 pve kernel: Modules linked in: wireguard curve25519_x86_64 libchacha20poly1305 chacha_x86_64 poly1305_x86_64 libcurve25519_generic libchacha ip6_udp_tunnel udp_tunnel nf_conntrack_netlink xt_state tcp_diag inet_diag iptable_nat xt_nat nft_chain_nat xt_MASQUERADE nf_nat xfrm_user xfrm_algo nft_counter nft_compat overlay binfmt_misc veth ebtable_filter ebtables ip6table_raw ip6t_REJECT nf_reject_ipv6 ip6table_filter ip6_tables iptable_raw xt_mac ipt_REJECT nf_reject_ipv4 xt_set xt_physdev xt_addrtype xt_multiport xt_conntrack nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 xt_comment xt_tcpudp xt_mark iptable_filter bpfilter ip_set_hash_net ip_set nf_tables softdog bonding tls nfnetlink_log nfnetlink intel_rapl_msr intel_rapl_common x86_pkg_temp_thermal snd_hda_codec_hdmi intel_powerclamp mei_hdcp coretemp rtl8723be btcoexist rtl8723_common snd_ctl_led snd_hda_codec_realtek snd_hda_codec_generic i915 ledtrig_audio rtl_pci snd_hda_intel snd_intel_dspcfg rtlwifi kvm_intel btusb ttm uvcvideo
Nov 23 17:52:38 pve kernel:  btrtl videobuf2_vmalloc btbcm videobuf2_memops videobuf2_v4l2 kvm btintel videobuf2_common snd_intel_sdw_acpi mac80211 irqbypass drm_kms_helper videodev snd_hda_codec crct10dif_pclmul ghash_clmulni_intel bluetooth aesni_intel crypto_simd cryptd rapl hp_wmi snd_hda_core platform_profile joydev pcspkr mc intel_cstate sparse_keymap input_leds wmi_bmof ecdh_generic snd_hwdep cec rc_core ecc serio_raw snd_pcm cfg80211 efi_pstore snd_timer mei_me i2c_algo_bit fb_sys_fops syscopyarea sysfillrect snd at24 soundcore sysimgblt libarc4 mei mac_hid hp_accel lis3lv02d wireless_hotkey acpi_pad zfs(PO) zunicode(PO) zzstd(O) zlua(O) zavl(PO) icp(PO) zcommon(PO) znvpair(PO) spl(O) vhost_net vhost vhost_iotlb tap ib_iser rdma_cm iw_cm ib_cm ib_core iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi drm sunrpc ip_tables x_tables autofs4 btrfs blake2b_generic xor zstd_compress raid6_pq simplefb dm_thin_pool dm_persistent_data dm_bio_prison dm_bufio libcrc32c rtsx_pci_sdmmc xhci_pci
Nov 23 17:52:38 pve kernel:  crc32_pclmul rtsx_pci psmouse ahci ehci_pci i2c_i801 libahci ehci_hcd i2c_smbus lpc_ich r8169 xhci_pci_renesas realtek xhci_hcd wmi video
Nov 23 17:52:38 pve kernel: CPU: 2 PID: 0 Comm: swapper/2 Tainted: P           O      5.15.74-1-pve #1
Nov 23 17:52:38 pve kernel: Hardware name: Hewlett-Packard HP 350 G2/803A, BIOS F.13 06/10/2015
Nov 23 17:52:38 pve kernel: RIP: 0010:dev_watchdog+0x277/0x280
Nov 23 17:52:38 pve kernel: Code: eb 97 48 8b 5d d0 c6 05 e5 a7 4d 01 01 48 89 df e8 0e 55 f9 ff 44 89 e1 48 89 de 48 c7 c7 10 9a ea 87 48 89 c2 e8 76 90 1c 00 <0f> 0b eb 80 e9 9a bf 25 00 0f 1f 44 00 00 55 49 89 ca 48 89 e5 41
Nov 23 17:52:38 pve kernel: RSP: 0018:ffffb054c0168e70 EFLAGS: 00010282
Nov 23 17:52:38 pve kernel: RAX: 0000000000000000 RBX: ffff8e8d8e07c000 RCX: 0000000000000000
Nov 23 17:52:38 pve kernel: RDX: ffff8e8fd1d2c240 RSI: ffff8e8fd1d20580 RDI: 0000000000000300
Nov 23 17:52:38 pve kernel: RBP: ffffb054c0168ea8 R08: 0000000000000003 R09: 0000000000000001
Nov 23 17:52:38 pve kernel: R10: 0000000000ffff0a R11: 0000000000000001 R12: 0000000000000000
Nov 23 17:52:38 pve kernel: R13: ffff8e8d801fb880 R14: 0000000000000001 R15: ffff8e8d8e07c4c0
Nov 23 17:52:38 pve kernel: FS:  0000000000000000(0000) GS:ffff8e8fd1d00000(0000) knlGS:0000000000000000
Nov 23 17:52:38 pve kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Nov 23 17:52:38 pve kernel: CR2: 00007fe42f71e000 CR3: 00000002bb210006 CR4: 00000000003706e0
Nov 23 17:52:38 pve kernel: Call Trace:
Nov 23 17:52:38 pve kernel:  <IRQ>
Nov 23 17:52:38 pve kernel:  ? pfifo_fast_enqueue+0x160/0x160
Nov 23 17:52:38 pve kernel:  call_timer_fn+0x2b/0x120
Nov 23 17:52:38 pve kernel:  __run_timers.part.0+0x1e1/0x270
Nov 23 17:52:38 pve kernel:  ? ktime_get+0x46/0xc0
Nov 23 17:52:38 pve kernel:  ? native_x2apic_icr_read+0x20/0x20
Nov 23 17:52:38 pve kernel:  ? lapic_next_event+0x21/0x30
Nov 23 17:52:38 pve kernel:  ? clockevents_program_event+0xab/0x130
Nov 23 17:52:38 pve kernel:  run_timer_softirq+0x2a/0x60
Nov 23 17:52:38 pve kernel:  __do_softirq+0xd9/0x2ea
Nov 23 17:52:38 pve kernel:  irq_exit_rcu+0x94/0xc0
Nov 23 17:52:38 pve kernel:  sysvec_apic_timer_interrupt+0x80/0x90
Nov 23 17:52:38 pve kernel:  </IRQ>
Nov 23 17:52:38 pve kernel:  <TASK>
Nov 23 17:52:38 pve kernel:  asm_sysvec_apic_timer_interrupt+0x1b/0x20
Nov 23 17:52:38 pve kernel: RIP: 0010:cpuidle_enter_state+0xd9/0x620
Nov 23 17:52:38 pve kernel: Code: 3d 64 64 df 78 e8 27 24 6e ff 49 89 c7 0f 1f 44 00 00 31 ff e8 68 31 6e ff 80 7d d0 00 0f 85 5e 01 00 00 fb 66 0f 1f 44 00 00 <45> 85 f6 0f 88 6a 01 00 00 4d 63 ee 49 83 fd 09 0f 87 e5 03 00 00
Nov 23 17:52:38 pve kernel: RSP: 0018:ffffb054c00efe38 EFLAGS: 00000246
Nov 23 17:52:38 pve kernel: RAX: ffff8e8fd1d30bc0 RBX: ffffd054bfd00000 RCX: 0000000000000000
Nov 23 17:52:38 pve kernel: RDX: 0000000000000006 RSI: 000000003a510811 RDI: 0000000000000000
Nov 23 17:52:38 pve kernel: RBP: ffffb054c00efe88 R08: 0000000ac0caf5c9 R09: 0000000000000000
Nov 23 17:52:38 pve kernel: R10: 0000000000000001 R11: 0000000000000000 R12: ffffffff886d4280
Nov 23 17:52:38 pve kernel: R13: 0000000000000008 R14: 0000000000000008 R15: 0000000ac0caf5c9
Nov 23 17:52:38 pve kernel:  ? cpuidle_enter_state+0xc8/0x620
Nov 23 17:52:38 pve kernel:  cpuidle_enter+0x2e/0x50
Nov 23 17:52:38 pve kernel:  do_idle+0x20d/0x2b0
Nov 23 17:52:38 pve kernel:  cpu_startup_entry+0x20/0x30
Nov 23 17:52:38 pve kernel:  start_secondary+0x12a/0x180
Nov 23 17:52:38 pve kernel:  secondary_startup_64_no_verify+0xc2/0xcb
Nov 23 17:52:38 pve kernel:  </TASK>
Nov 23 17:52:38 pve kernel: ---[ end trace 2cf9f2d86c92dcd4 ]---
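
The relevant line in a trace like this is the "NETDEV WATCHDOG" message, which names the interface and the driver. As a small sketch for pulling just those lines out of a large syslog file, assuming the message format shown above, something like the following could be used. The function name `tx_timeouts` is made up for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: summarize transmit-queue timeouts found in a log file.
# $1 = path to the log file
# Prints one line per interface/driver pair: "<interface> <driver> <count>"
tx_timeouts() {
    # Matches lines such as:
    #   ... kernel: NETDEV WATCHDOG: enp1s0 (r8169): transmit queue 0 timed out
    grep 'NETDEV WATCHDOG:' "$1" \
        | sed -E 's/.*NETDEV WATCHDOG: ([^ ]+) \(([^)]+)\).*/\1 \2/' \
        | sort | uniq -c | awk '{ print $2, $3, $1 }'
}
```

Run as `tx_timeouts syslog.txt`, it prints nothing if the log contains no such message.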
 
OK, now I tried a workaround: I bought a new USB Ethernet adapter and installed it on Proxmox.
I assigned it another IP address and created a new interface for it, so my PVE GUI is now reachable from 2 IP addresses.
It works fine again with kernel version 5.13.19-6.
If I boot the newer kernel (5.15.74), my old problem is still there, and when I try to reach Proxmox using the new USB Ethernet adapter, I get no response in the browser. In the syslog I see:
Code:
Dec 01 20:27:42 pve kernel: INFO: task kworker/3:2:199 blocked for more than 241 seconds.
Dec 01 20:27:42 pve kernel:       Tainted: P        W  O      5.15.74-1-pve #1
Dec 01 20:27:42 pve kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Dec 01 20:27:42 pve kernel: task:kworker/3:2     state:D stack:    0 pid:  199 ppid:     2 flags:0x00004000
Dec 01 20:27:42 pve kernel: Workqueue: events rtl_task [r8169]
Dec 01 20:27:42 pve kernel: Call Trace:
Dec 01 20:27:42 pve kernel:  <TASK>
Dec 01 20:27:42 pve kernel:  __schedule+0x34e/0x1740
Dec 01 20:27:42 pve kernel:  ? hrtimer_reprogram+0x52/0xb0
Dec 01 20:27:42 pve kernel:  ? schedule+0x85/0x110
Dec 01 20:27:42 pve kernel:  ? schedule_hrtimeout_range_clock+0xa3/0x130
Dec 01 20:27:42 pve kernel:  schedule+0x69/0x110
Dec 01 20:27:42 pve kernel:  schedule_preempt_disabled+0xe/0x20
Dec 01 20:27:42 pve kernel:  __mutex_lock.constprop.0+0x255/0x480
Dec 01 20:27:42 pve kernel:  ? net_ratelimit+0x1c/0x30
Dec 01 20:27:42 pve kernel:  ? rtl_hw_start_8411_2+0x940/0x940 [r8169]
Dec 01 20:27:42 pve kernel:  __mutex_lock_slowpath+0x13/0x20
Dec 01 20:27:42 pve kernel:  mutex_lock+0x38/0x50
Dec 01 20:27:42 pve kernel:  rtl_reset_work+0x17b/0x460 [r8169]
Dec 01 20:27:42 pve kernel:  rtl_task+0x4d/0x70 [r8169]
Dec 01 20:27:42 pve kernel:  process_one_work+0x22b/0x3d0
Dec 01 20:27:42 pve kernel:  worker_thread+0x53/0x420
Dec 01 20:27:42 pve kernel:  ? process_one_work+0x3d0/0x3d0
Dec 01 20:27:42 pve kernel:  kthread+0x12a/0x150
Dec 01 20:27:42 pve kernel:  ? set_kthread_struct+0x50/0x50
Dec 01 20:27:42 pve kernel:  ret_from_fork+0x22/0x30
Dec 01 20:27:42 pve kernel:  </TASK>

Seriously guys, I know it is probably a kernel problem again, but I need help or I will have to throw Proxmox away...
 
