Hypervisor Hangs on LXC container shutdown

redtex

Renowned Member
Sep 13, 2012
Hi all!

This is primarily a bug report.
I can reliably reproduce the following: when an LXC container shuts down, the Proxmox host hangs until a hardware reset.
The same behavior occurs on different hypervisors, so the problem appears to be hardware-independent.
It can be reproduced after several container starts and stops.
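
For anyone who wants to try reproducing this, the start/stop cycling described above could be scripted roughly as follows. This is only a sketch, not the exact procedure used for this report: it assumes the container is driven with the standard `pct start` and `pct shutdown` commands, and the names `CTID`, `CYCLES`, `SETTLE_SECONDS`, `PCT` and `cycle_container` are made up for illustration.

```shell
#!/bin/sh
# Hypothetical reproduction loop: repeatedly start and shut down one container.
# CTID, CYCLES, SETTLE_SECONDS and PCT are illustrative names, not from the report.
PCT="${PCT:-pct}"

cycle_container() {
    ctid="$1"
    cycles="$2"
    i=1
    while [ "$i" -le "$cycles" ]; do
        echo "cycle $i: starting CT $ctid"
        "$PCT" start "$ctid" || return 1
        # Give the container time to boot before shutting it down again.
        sleep "${SETTLE_SECONDS:-30}"
        echo "cycle $i: shutting down CT $ctid"
        "$PCT" shutdown "$ctid" || return 1
        i=$((i + 1))
    done
}

# Only run when a container ID is supplied, e.g. CTID=101 CYCLES=20 sh repro.sh
if [ -n "${CTID:-}" ]; then
    cycle_container "$CTID" "${CYCLES:-10}"
fi
```

Running `dmesg -w` in a second terminal (or on the IPMI console) while the loop runs should show the soft lockup message if the host hangs.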

pveversion --verbose
proxmox-ve: 7.2-1 (running kernel: 5.15.64-1-pve)
pve-manager: 7.2-11 (running version: 7.2-11/b76d3178)
pve-kernel-5.15: 7.2-13
pve-kernel-helper: 7.2-13
pve-kernel-5.15.64-1-pve: 5.15.64-1
pve-kernel-5.15.39-4-pve: 5.15.39-4
pve-kernel-5.15.30-2-pve: 5.15.30-3
ceph-fuse: 15.2.16-pve1
corosync: 3.1.5-pve2
criu: 3.15-1+pve-1
glusterfs-client: 9.2-1
ifupdown2: 3.1.0-1+pmx3
ksm-control-daemon: 1.4-1
libjs-extjs: 7.0.0-1
libknet1: 1.24-pve1
libproxmox-acme-perl: 1.4.2
libproxmox-backup-qemu0: 1.3.1-1
libpve-access-control: 7.2-4
libpve-apiclient-perl: 3.2-1
libpve-common-perl: 7.2-3
libpve-guest-common-perl: 4.1-4
libpve-http-server-perl: 4.1-4
libpve-storage-perl: 7.2-10
libspice-server1: 0.14.3-2.1
lvm2: 2.03.11-2.1
lxc-pve: 5.0.0-3
lxcfs: 4.0.12-pve1
novnc-pve: 1.3.0-3
openvswitch-switch: 2.15.0+ds1-2+deb11u1
proxmox-backup-client: 2.2.7-1
proxmox-backup-file-restore: 2.2.7-1
proxmox-mini-journalreader: 1.3-1
proxmox-widget-toolkit: 3.5.1
pve-cluster: 7.2-2
pve-container: 4.2-3
pve-docs: 7.2-2
pve-edk2-firmware: 3.20220526-1
pve-firewall: 4.2-6
pve-firmware: 3.5-6
pve-ha-manager: 3.4.0
pve-i18n: 2.7-2
pve-qemu-kvm: 7.0.0-4
pve-xtermjs: 4.16.0-1
qemu-server: 7.2-4
smartmontools: 7.2-pve3
spiceterm: 3.2-2
swtpm: 0.7.1~bpo11+1
vncterm: 1.7-1
zfsutils-linux: 2.1.6-pve1

[ 1381.120401] watchdog: BUG: soft lockup - CPU#30 stuck for 26s! [swapper/30:0]
[ 1381.121214] Modules linked in: xt_nat xt_MASQUERADE xt_tcpudp xt_mark ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 ipt_REJECT nf_reject_ipv4 xt_conntrack ebtable_nat ebtable_broute ip6table_nat ip6table_mangle ip6table_security iptable_nat iptable_mangle iptable_security veth ebtable_filter ebtables ip_set ip6table_raw iptable_raw ip6table_filter ip6_tables iptable_filter bpfilter sctp ip6_udp_tunnel udp_tunnel nf_tables nfnetlink_cttimeout bonding tls openvswitch nsh nf_conncount nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ipmi_watchdog nfnetlink_log nfnetlink intel_rapl_msr intel_rapl_common isst_if_common ipmi_ssif skx_edac nfit x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm irqbypass crct10dif_pclmul ghash_clmulni_intel aesni_intel crypto_simd cryptd ast drm_vram_helper drm_ttm_helper ttm irdma drm_kms_helper ice cec rapl rc_core i2c_algo_bit mei_me fb_sys_fops intel_cstate pcspkr efi_pstore joydev input_leds ib_uverbs syscopyarea acpi_ipmi ioatdma sysfillrect
[ 1381.121264] sysimgblt mei dca intel_pch_thermal ipmi_si ipmi_devintf ipmi_msghandler acpi_pad acpi_power_meter mac_hid vhost_net vhost vhost_iotlb tap ib_iser rdma_cm iw_cm ib_cm ib_core iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi drm sunrpc ip_tables x_tables autofs4 hid_generic usbkbd usbmouse usbhid hid zfs(PO) zunicode(PO) zzstd(O) zlua(O) zavl(PO) icp(PO) zcommon(PO) znvpair(PO) spl(O) btrfs blake2b_generic xor zstd_compress raid6_pq libcrc32c simplefb crc32_pclmul xhci_pci xhci_pci_renesas nvme i2c_i801 i40e ahci i2c_smbus xhci_hcd lpc_ich nvme_core libahci wmi
[ 1381.130736] CPU: 30 PID: 0 Comm: swapper/30 Tainted: P O 5.15.64-1-pve #1
[ 1381.131469] Hardware name: Supermicro SYS-1029P-WTRT/X11DDW-NT, BIOS 3.5 05/27/2021
[ 1381.132197] RIP: 0010:netdev_pick_tx+0xe3/0x310
[ 1381.132993] Code: 66 85 c0 0f 84 53 01 00 00 8d 48 ff 0f b7 c1 66 39 ca 0f 86 dd 01 00 00 39 c3 0f 87 55 01 00 00 29 d8 39 c3 0f 87 4b 01 00 00 <eb> f4 0f 1f 44 00 00 49 8b 94 24 28 04 00 00 48 85 d2 0f 84 4e 01
[ 1381.134501] RSP: 0018:ffffa268cd080618 EFLAGS: 00000297
[ 1381.135213] RAX: 000000000000001e RBX: 0000000000000000 RCX: 000000000000003c
[ 1381.135922] RDX: 0000000000000000 RSI: ffff91d0edc05e00 RDI: ffff91d08717e000
[ 1381.136658] RBP: ffffa268cd080658 R08: 0000000000000000 R09: ffffa268cd080510
[ 1381.137440] R10: ffff91d087481000 R11: ffffa268cd0809a0 R12: ffff91d08717e000
[ 1381.138161] R13: 00000000ffffffff R14: ffff91d0edc05e00 R15: ffff91d08717e000
[ 1381.138826] FS: 0000000000000000(0000) GS:ffff91e81fd80000(0000) knlGS:0000000000000000
[ 1381.139508] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1381.140177] CR2: 00007f77c2899b60 CR3: 000000271a410005 CR4: 00000000007726e0
[ 1381.140892] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 1381.141748] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 1381.142596] PKRU: 55555554
[ 1381.143296] Call Trace:
[ 1381.143936] <IRQ>
[ 1381.144616] netdev_core_pick_tx+0xa4/0xb0
[ 1381.145286] __dev_queue_xmit+0x1b8/0xb30
[ 1381.145921] ? netif_rx_internal+0x3a/0x100
[ 1381.146527] dev_queue_xmit+0x10/0x20
[ 1381.147115] ovs_vport_send+0xab/0x170 [openvswitch]
[ 1381.147727] do_output+0x59/0x180 [openvswitch]
[ 1381.148324] do_execute_actions+0x1841/0x1b40 [openvswitch]
[ 1381.148960] ? __wake_up_common+0x7b/0x140
[ 1381.149617] ? __skb_flow_dissect+0x2f2/0x1920
[ 1381.150165] ? flow_lookup.constprop.0+0x5c/0x110 [openvswitch]
[ 1381.150729] ovs_execute_actions+0x48/0x110 [openvswitch]
[ 1381.151268] ? ovs_execute_actions+0x48/0x110 [openvswitch]
[ 1381.151818] ovs_dp_process_packet+0xa1/0x1f0 [openvswitch]
[ 1381.152354] ? ovs_ct_update_key.isra.0+0xa8/0x120 [openvswitch]
[ 1381.152941] ? ovs_ct_fill_key+0x1d/0x30 [openvswitch]
[ 1381.153531] ? ovs_flow_key_extract+0x2da/0x350 [openvswitch]
[ 1381.154049] ovs_vport_receive+0x77/0xd0 [openvswitch]
[ 1381.154560] ? __fib_validate_source+0x245/0x4b0
[ 1381.155038] ? kmem_cache_alloc+0x1ab/0x2f0
[ 1381.155525] ? cpumask_next_and+0x24/0x30
[ 1381.155974] ? update_sd_lb_stats.constprop.0+0x130/0xca0
[ 1381.156439] netdev_frame_hook+0xdf/0x1b0 [openvswitch]
[ 1381.156936] ? netdev_create+0x40/0x40 [openvswitch]
[ 1381.157401] __netif_receive_skb_core+0x236/0xef0
[ 1381.157871] __netif_receive_skb_list_core+0x107/0x260
[ 1381.158283] netif_receive_skb_list_internal+0x1a1/0x2c0
[ 1381.158713] ? dev_gro_receive+0x2d6/0x7b0
[ 1381.159080] ? kmem_cache_alloc+0x1ab/0x2f0
[ 1381.159434] ? __build_skb+0x26/0x60
[ 1381.159852] napi_complete_done+0x7a/0x1c0
[ 1381.160230] i40e_napi_poll+0xc4a/0x1340 [i40e]
[ 1381.160644] __napi_poll+0x30/0x180
[ 1381.161004] net_rx_action+0x126/0x280
[ 1381.161352] __do_softirq+0xd6/0x2ea
[ 1381.161726] irq_exit_rcu+0x94/0xc0
[ 1381.162027] common_interrupt+0x8e/0xa0
[ 1381.162327] </IRQ>
[ 1381.162652] <TASK>
[ 1381.162956] asm_common_interrupt+0x27/0x40
[ 1381.163260] RIP: 0010:cpuidle_enter_state+0xd9/0x620
[ 1381.163567] Code: 3d 64 69 5f 64 e8 27 29 6e ff 49 89 c7 0f 1f 44 00 00 31 ff e8 68 36 6e ff 80 7d d0 00 0f 85 5e 01 00 00 fb 66 0f 1f 44 00 00 <45> 85 f6 0f 88 6a 01 00 00 4d 63 ee 49 83 fd 09 0f 87 e5 03 00 00
[ 1381.164212] RSP: 0018:ffffa268c034fe38 EFLAGS: 00000246
[ 1381.164547] RAX: ffff91e81fdb0bc0 RBX: ffffc268bfd84d00 RCX: 0000000000000000
[ 1381.164920] RDX: 0000000000004322 RSI: 00000000280000d1 RDI: 0000000000000000
[ 1381.165265] RBP: ffffa268c034fe88 R08: 0000013bc2a7887a R09: 0000000000000000
[ 1381.165631] R10: 0000000000000001 R11: 0000000000000000 R12: ffffffff9ced4200
[ 1381.165971] R13: 0000000000000003 R14: 0000000000000003 R15: 0000013bc2a7887a
[ 1381.166293] ? cpuidle_enter_state+0xc8/0x620
[ 1381.166614] cpuidle_enter+0x2e/0x50
[ 1381.166955] do_idle+0x20d/0x2b0
[ 1381.167273] cpu_startup_entry+0x20/0x30
[ 1381.167589] start_secondary+0x12a/0x180
[ 1381.167927] secondary_startup_64_no_verify+0xc2/0xcb
[ 1381.168247] </TASK>
 

The same problem is described here: https://forum.proxmox.com/threads/system-lockup-how-to-diagnose.121540/
It does not happen with kernel 5.13.
 
