Hi,
I am running Proxmox with two 1080 GPUs. The VM OS disks are on ZFS and the additional VM disks are on NFS storage.
Setup details:
pve-manager/7.4-13/46c37d9c (running kernel: 5.15.74-1-pve)
24 x AMD Ryzen 9 5900X 12-Core Processor (1 Socket)
48 GB DDR4 (NON-ECC) + 1TB NVME SAMSUNG SSD
2 x 1080 GPU
ZFS for the VM OS disks, secondary disks from NFS
The issue occurs randomly: the CPUs get stuck, I lose SSH and GUI access, and after waiting for some time I have to reboot the host.
Jun 24 18:43:20 dev-proxmox pvestatd[2115]: VM 100 qmp command failed - VM 100 qmp command 'query-proxmox-support' failed - unable to connect to VM 100 qmp socket - timeout after 51 retries
Jun 24 18:43:20 dev-proxmox pvestatd[2115]: status update time (8.029 seconds)
Jun 24 18:43:30 dev-proxmox kernel: watchdog: BUG: soft lockup - CPU#9 stuck for 4481s! [z_wr_int_1:589]
Jun 24 18:43:30 dev-proxmox kernel: Modules linked in: 8021q garp mrp tcp_diag inet_diag rpcsec_gss_krb5 auth_rpcgss nfsv4 nfs lockd grace fscache netfs ebtable_filter ebtables ip_set ip6table_raw iptable_raw ip6table_filter ip6_tables iptable_filter bpfilter nf_tables bonding tls softdog nfnetlink_log nfnetlink intel_rapl_msr intel_rapl_common edac_mce_amd kvm_amd snd_hda_codec_realtek snd_hda_codec_generic ledtrig_audio kvm crct10dif_pclmul snd_hda_intel ghash_clmulni_intel snd_intel_dspcfg snd_intel_sdw_acpi aesni_intel snd_hda_codec crypto_simd snd_hda_core cryptd snd_hwdep rapl snd_pcm gigabyte_wmi snd_timer wmi_bmof pcspkr ccp efi_pstore k10temp mxm_wmi snd soundcore vhost_net mac_hid vhost vhost_iotlb tap ib_iser rdma_cm iw_cm ib_cm ib_core iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi vfio_pci vfio_pci_core vfio_virqfd irqbypass vfio_iommu_type1 vfio drm sunrpc ip_tables x_tables autofs4 zfs(PO) zunicode(PO) zzstd(O) zlua(O) zavl(PO) icp(PO) zcommon(PO) znvpair(PO) spl(O) btrfs blake2b_generic
Jun 24 18:43:30 dev-proxmox kernel: xor zstd_compress raid6_pq libcrc32c simplefb hid_generic usbkbd usbhid hid ahci xhci_pci crc32_pclmul i2c_piix4 xhci_pci_renesas libahci ixgbe xhci_hcd igb nvme i2c_algo_bit xfrm_algo dca mdio nvme_core wmi
Jun 24 18:43:30 dev-proxmox kernel: CPU: 9 PID: 589 Comm: z_wr_int_1 Tainted: P D W O L 5.15.74-1-pve #1
Jun 24 18:43:30 dev-proxmox kernel: Hardware name: Gigabyte Technology Co., Ltd. X570 AORUS PRO/X570 AORUS PRO, BIOS F36d 07/20/2022
Jun 24 18:43:30 dev-proxmox kernel: RIP: 0010:smp_call_function_many_cond+0x13f/0x360
Jun 24 18:43:30 dev-proxmox kernel: Code: c4 73 2d 4d 63 ec 48 8b 13 49 81 fd ff 1f 00 00 0f 87 e3 01 00 00 4a 03 14 ed e0 fa cb a2 8b 42 08 a8 01 74 09 f3 90 8b 42 08 <a8> 01 75 f7 eb bc 48 83 c4 40 5b 41 5c 41 5d 41 5e 41 5f 5d c3 cc
Jun 24 18:43:30 dev-proxmox kernel: RSP: 0018:ffffbcc2027efa48 EFLAGS: 00000202
Jun 24 18:43:30 dev-proxmox kernel: RAX: 0000000000000011 RBX: ffff92c65ec71c00 RCX: 0000000000000011
Jun 24 18:43:30 dev-proxmox kernel: RDX: ffff92c65ee77bc0 RSI: 0000000000000000 RDI: ffff92bb40067668
Jun 24 18:43:30 dev-proxmox kernel: RBP: ffffbcc2027efab0 R08: 0000000000000000 R09: 0000000000000000
Jun 24 18:43:30 dev-proxmox kernel: R10: 0000000000000011 R11: fffffffffffe0000 R12: 0000000000000011
Jun 24 18:43:30 dev-proxmox kernel: R13: 0000000000000011 R14: 0000000000000001 R15: 0000000000000020
Jun 24 18:43:30 dev-proxmox kernel: FS: 0000000000000000(0000) GS:ffff92c65ec40000(0000) knlGS:0000000000000000
Jun 24 18:43:30 dev-proxmox kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jun 24 18:43:30 dev-proxmox kernel: CR2: 0000027047826000 CR3: 00000005ca6e2000 CR4: 0000000000750ee0
Jun 24 18:43:30 dev-proxmox kernel: PKRU: 55555554
Jun 24 18:43:30 dev-proxmox kernel: Call Trace:
Jun 24 18:43:30 dev-proxmox kernel: <TASK>
Jun 24 18:43:30 dev-proxmox kernel: ? __flush_tlb_all+0x30/0x30
Jun 24 18:43:30 dev-proxmox kernel: on_each_cpu_cond_mask+0x22/0x30
Jun 24 18:43:30 dev-proxmox kernel: flush_tlb_kernel_range+0x41/0xa0
Jun 24 18:43:30 dev-proxmox kernel: __purge_vmap_area_lazy+0xb9/0x700
Jun 24 18:43:30 dev-proxmox kernel: ? __cond_resched+0x1a/0x50
Jun 24 18:43:30 dev-proxmox kernel: free_vmap_area_noflush+0x2ef/0x330
Jun 24 18:43:30 dev-proxmox kernel: remove_vm_area+0x9e/0xb0
Jun 24 18:43:30 dev-proxmox kernel: __vunmap+0x93/0x2a0
Jun 24 18:43:30 dev-proxmox kernel: __vfree+0x22/0x70
Jun 24 18:43:30 dev-proxmox kernel: vfree+0x2c/0x50
Jun 24 18:43:30 dev-proxmox kernel: spl_slab_reclaim+0x172/0x1b0 [spl]
Jun 24 18:43:30 dev-proxmox kernel: spl_kmem_cache_free+0x187/0x200 [spl]
Jun 24 18:43:30 dev-proxmox kernel: zio_buf_free+0x33/0x80 [zfs]
Jun 24 18:43:30 dev-proxmox kernel: abd_free+0x1cd/0x1e0 [zfs]
Jun 24 18:43:30 dev-proxmox kernel: zio_pop_transforms+0x88/0xa0 [zfs]
Jun 24 18:43:30 dev-proxmox kernel: zio_done+0x17f/0x1290 [zfs]
Jun 24 18:43:30 dev-proxmox kernel: zio_execute+0x95/0x160 [zfs]
Jun 24 18:43:30 dev-proxmox kernel: taskq_thread+0x29f/0x4d0 [spl]
Jun 24 18:43:30 dev-proxmox kernel: ? wake_up_q+0x90/0x90
Jun 24 18:43:30 dev-proxmox kernel: ? zio_gang_tree_free+0x70/0x70 [zfs]
Jun 24 18:43:30 dev-proxmox kernel: ? taskq_thread_spawn+0x60/0x60 [spl]
Jun 24 18:43:30 dev-proxmox kernel: kthread+0x12a/0x150
Jun 24 18:43:30 dev-proxmox kernel: ? set_kthread_struct+0x50/0x50
Jun 24 18:43:30 dev-proxmox kernel: ret_from_fork+0x22/0x30
Jun 24 18:43:30 dev-proxmox kernel: </TASK>
Jun 24 18:43:30 dev-proxmox pvestatd[2115]: VM 100 qmp command failed - VM 100 qmp command 'query-proxmox-support' failed - unable to connect to VM 100 qmp socket - timeout after 51 retries
Jun 24 18:43:30 dev-proxmox pvestatd[2115]: status update time (8.029 seconds)
Jun 24 18:43:34 dev-proxmox kernel: watchdog: BUG: soft lockup - CPU#11 stuck for 4511s! [kworker/11:0:56662]
Jun 24 18:43:34 dev-proxmox kernel: Modules linked in: 8021q garp mrp tcp_diag inet_diag rpcsec_gss_krb5 auth_rpcgss nfsv4 nfs lockd grace fscache netfs ebtable_filter ebtables ip_set ip6table_raw iptable_raw ip6table_filter ip6_tables iptable_filter bpfilter nf_tables bonding tls softdog nfnetlink_log nfnetlink intel_rapl_msr intel_rapl_common edac_mce_amd kvm_amd snd_hda_codec_realtek snd_hda_codec_generic ledtrig_audio kvm crct10dif_pclmul snd_hda_intel ghash_clmulni_intel snd_intel_dspcfg snd_intel_sdw_acpi aesni_intel snd_hda_codec crypto_simd snd_hda_core cryptd snd_hwdep rapl snd_pcm gigabyte_wmi snd_timer wmi_bmof pcspkr ccp efi_pstore k10temp mxm_wmi snd soundcore vhost_net mac_hid vhost vhost_iotlb tap ib_iser rdma_cm iw_cm ib_cm ib_core iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi vfio_pci vfio_pci_core vfio_virqfd irqbypass vfio_iommu_type1 vfio drm sunrpc ip_tables x_tables autofs4 zfs(PO) zunicode(PO) zzstd(O) zlua(O) zavl(PO) icp(PO) zcommon(PO) znvpair(PO) spl(O) btrfs blake2b_generic
Jun 24 18:43:34 dev-proxmox kernel: xor zstd_compress raid6_pq libcrc32c simplefb hid_generic usbkbd usbhid hid ahci xhci_pci crc32_pclmul i2c_piix4 xhci_pci_renesas libahci ixgbe xhci_hcd igb nvme i2c_algo_bit xfrm_algo dca mdio nvme_core wmi
Jun 24 18:43:34 dev-proxmox kernel: CPU: 11 PID: 56662 Comm: kworker/11:0 Tainted: P D W O L 5.15.74-1-pve #1
Jun 24 18:43:34 dev-proxmox kernel: Hardware name: Gigabyte Technology Co., Ltd. X570 AORUS PRO/X570 AORUS PRO, BIOS F36d 07/20/2022
Jun 24 18:43:34 dev-proxmox kernel: Workqueue: events netstamp_clear
Jun 24 18:43:34 dev-proxmox kernel: RIP: 0010:smp_call_function_many_cond+0x13c/0x360
Jun 24 18:43:34 dev-proxmox kernel: Code: 01 41 89 c4 73 2d 4d 63 ec 48 8b 13 49 81 fd ff 1f 00 00 0f 87 e3 01 00 00 4a 03 14 ed e0 fa cb a2 8b 42 08 a8 01 74 09 f3 90 <8b> 42 08 a8 01 75 f7 eb bc 48 83 c4 40 5b 41 5c 41 5d 41 5e 41 5f
Jun 24 18:43:34 dev-proxmox kernel: RSP: 0018:ffffbcc21fa3fcf0 EFLAGS: 00000202
Jun 24 18:43:34 dev-proxmox kernel: RAX: 0000000000000011 RBX: ffff92c65ecf1c00 RCX: 0000000000000011
Jun 24 18:43:34 dev-proxmox kernel: RDX: ffff92c65ee77c00 RSI: 0000000000000000 RDI: ffff92bb40e5cd00
Jun 24 18:43:34 dev-proxmox kernel: RBP: ffffbcc21fa3fd58 R08: 0000000000000000 R09: 0000000000000000
Jun 24 18:43:34 dev-proxmox kernel: R10: 0000000000000011 R11: fffffffffffe0000 R12: 0000000000000011
Jun 24 18:43:34 dev-proxmox kernel: R13: 0000000000000011 R14: 0000000000000001 R15: 0000000000000020
Jun 24 18:43:34 dev-proxmox kernel: FS: 0000000000000000(0000) GS:ffff92c65ecc0000(0000) knlGS:0000000000000000
Jun 24 18:43:34 dev-proxmox kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jun 24 18:43:34 dev-proxmox kernel: CR2: 0000000000e203c0 CR3: 0000000946810000 CR4: 0000000000750ee0
Jun 24 18:43:34 dev-proxmox kernel: PKRU: 55555554
Jun 24 18:43:34 dev-proxmox kernel: Call Trace:
Jun 24 18:43:34 dev-proxmox kernel: <TASK>
Jun 24 18:43:34 dev-proxmox kernel: ? text_poke_loc_init+0x190/0x190
Jun 24 18:43:34 dev-proxmox kernel: on_each_cpu_cond_mask+0x22/0x30
Jun 24 18:43:34 dev-proxmox kernel: text_poke_bp_batch+0xb2/0x270
Jun 24 18:43:34 dev-proxmox kernel: text_poke_finish+0x1f/0x40
Jun 24 18:43:34 dev-proxmox kernel: arch_jump_label_transform_apply+0x1a/0x30
Jun 24 18:43:34 dev-proxmox kernel: __jump_label_update+0xf3/0x140
Jun 24 18:43:34 dev-proxmox kernel: jump_label_update+0xba/0xe0
Jun 24 18:43:34 dev-proxmox kernel: static_key_enable_cpuslocked+0x77/0xa0
Jun 24 18:43:34 dev-proxmox kernel: static_key_enable+0x1b/0x30
Jun 24 18:43:34 dev-proxmox kernel: netstamp_clear+0x2d/0x40
Jun 24 18:43:34 dev-proxmox kernel: process_one_work+0x22b/0x3d0
Jun 24 18:43:34 dev-proxmox kernel: worker_thread+0x53/0x420
Jun 24 18:43:34 dev-proxmox kernel: ? process_one_work+0x3d0/0x3d0
Jun 24 18:43:34 dev-proxmox kernel: kthread+0x12a/0x150
Jun 24 18:43:34 dev-proxmox kernel: ? set_kthread_struct+0x50/0x50
Jun 24 18:43:34 dev-proxmox kernel: ret_from_fork+0x22/0x30
Jun 24 18:43:34 dev-proxmox kernel: </TASK>
Jun 24 18:43:40 dev-proxmox pvestatd[2115]: VM 100 qmp command failed - VM 100 qmp command 'query-proxmox-support' failed - unable to connect to VM 100 qmp socket - timeout after 51 retries
Jun 24 18:43:40 dev-proxmox pvestatd[2115]: status update time (8.029 seconds)
Jun 24 18:43:42 dev-proxmox kernel: watchdog: BUG: soft lockup - CPU#3 stuck for 4519s! [kcompactd0:164]
Jun 24 18:43:42 dev-proxmox kernel: Modules linked in: 8021q garp mrp tcp_diag inet_diag rpcsec_gss_krb5 auth_rpcgss nfsv4 nfs lockd grace fscache netfs ebtable_filter ebtables ip_set ip6table_raw iptable_raw ip6table_filter ip6_tables iptable_filter bpfilter nf_tables bonding tls softdog nfnetlink_log nfnetlink intel_rapl_msr intel_rapl_common edac_mce_amd kvm_amd snd_hda_codec_realtek snd_hda_codec_generic ledtrig_audio kvm crct10dif_pclmul snd_hda_intel ghash_clmulni_intel snd_intel_dspcfg snd_intel_sdw_acpi aesni_intel snd_hda_codec crypto_simd snd_hda_core cryptd snd_hwdep rapl snd_pcm gigabyte_wmi snd_timer wmi_bmof pcspkr ccp efi_pstore k10temp mxm_wmi snd soundcore vhost_net mac_hid vhost vhost_iotlb tap ib_iser rdma_cm iw_cm ib_cm ib_core iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi vfio_pci vfio_pci_core vfio_virqfd irqbypass vfio_iommu_type1 vfio drm sunrpc ip_tables x_tables autofs4 zfs(PO) zunicode(PO) zzstd(O) zlua(O) zavl(PO) icp(PO) zcommon(PO) znvpair(PO) spl(O) btrfs blake2b_generic
Jun 24 18:43:42 dev-proxmox kernel: xor zstd_compress raid6_pq libcrc32c simplefb hid_generic usbkbd usbhid hid ahci xhci_pci crc32_pclmul i2c_piix4 xhci_pci_renesas libahci ixgbe xhci_hcd igb nvme i2c_algo_bit xfrm_algo dca mdio nvme_core wmi
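To make the lockup entries above easier to compare, here is a minimal Python sketch that pulls the CPU number, stuck time, task name and PID out of the watchdog lines. The function name `find_soft_lockups` and the regular expression are my own assumptions based on the log format shown here; they are not part of any Proxmox or kernel tool.

```python
import re

# Matches kernel watchdog lines such as:
#   "... watchdog: BUG: soft lockup - CPU#9 stuck for 4481s! [z_wr_int_1:589]"
# The pattern is an assumption derived from the lines pasted in this post.
LOCKUP_RE = re.compile(
    r"soft lockup - CPU#(?P<cpu>\d+) stuck for (?P<secs>\d+)s! "
    r"\[(?P<task>.+):(?P<pid>\d+)\]"
)


def find_soft_lockups(lines):
    """Return a list of (cpu, seconds_stuck, task, pid) tuples, one per soft-lockup line."""
    hits = []
    for line in lines:
        m = LOCKUP_RE.search(line)
        if m:
            hits.append((int(m["cpu"]), int(m["secs"]), m["task"], int(m["pid"])))
    return hits


if __name__ == "__main__":
    sample = [
        "Jun 24 18:43:30 dev-proxmox kernel: watchdog: BUG: soft lockup - CPU#9 stuck for 4481s! [z_wr_int_1:589]",
        "Jun 24 18:43:30 dev-proxmox pvestatd[2115]: status update time (8.029 seconds)",
        "Jun 24 18:43:34 dev-proxmox kernel: watchdog: BUG: soft lockup - CPU#11 stuck for 4511s! [kworker/11:0:56662]",
    ]
    for cpu, secs, task, pid in find_soft_lockups(sample):
        print(f"CPU {cpu}: task {task} (pid {pid}) stuck for {secs}s")
```

In the three entries above, the stuck tasks are z_wr_int_1 (a ZFS write thread), a kworker and kcompactd0. The two call traces shown both end in smp_call_function_many_cond, so both of those CPUs appear to be waiting on other CPUs rather than spinning in their own code.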
I have now changed the VM config to the following (not yet tested). I will update if the issue happens again:
sata0: local-zfs:vm-101-disk-1,aio=threads,cache=writeback,discard=on,size=100G,snapshot=1,ssd=1
sata1: local:100/vm100disk.qcow2,aio=threads,backup=0,cache=writeback,discard=on,snapshot=1
scsihw: virtio-scsi-pci
args: -smp '8,cores=4,threads=2,sockets=1,maxcpus=8' -cpu 'host,-hypervisor,topoext=on,hv_ipi,hv_relaxed,hv_reset,hv_runtime,hv_spinlocks=0x1fff,hv_stimer,hv_synic,hv_time,hv_vapic,hv_vpindex,+kvm_pv_eoi,+kvm_pv_unhalt'
agent: 1
balloon: 0
bios: ovmf
boot: order=sata0
cores: 8
cpu: host
Please help. I can't figure out what the issue is.