[SOLVED] kernel panic since update to Proxmox 7.1

CvH

Hi, since the upgrade we are seeing a kernel panic under medium load on a node, which results in a complete crash of the machine.
Sadly it is not reproducible; it happens 3-4 times a week at random times (full load, no load, ...).
This never happened before the upgrade from 6.4 -> 7.1 (two weeks ago), so chances are good it is somehow connected.

Code:
[Fri Dec 10 08:38:19 2021] libceph: osd15 down
[Fri Dec 10 08:38:35 2021] libceph: osd13 down
[Fri Dec 10 08:38:36 2021] rbd: rbd6: encountered watch error: -107
[Fri Dec 10 08:38:37 2021] rbd: rbd4: encountered watch error: -107
[Fri Dec 10 08:39:43 2021] INFO: task jbd2/dm-9-8:649 blocked for more than 120 seconds.
[Fri Dec 10 08:39:43 2021]       Tainted: P           O      5.13.19-1-pve #1
[Fri Dec 10 08:39:43 2021] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[Fri Dec 10 08:39:43 2021] task:jbd2/dm-9-8     state:D stack:    0 pid:  649 ppid:     2 flags:0x00004000
[Fri Dec 10 08:39:43 2021] Call Trace:
[Fri Dec 10 08:39:43 2021]  ? wbt_cleanup_cb+0x20/0x20
[Fri Dec 10 08:39:43 2021]  __schedule+0x2fa/0x910
[Fri Dec 10 08:39:43 2021]  ? wbt_cleanup_cb+0x20/0x20
[Fri Dec 10 08:39:43 2021]  schedule+0x4f/0xc0
[Fri Dec 10 08:39:43 2021]  io_schedule+0x46/0x70
[Fri Dec 10 08:39:43 2021]  rq_qos_wait+0xbd/0x150
[Fri Dec 10 08:39:43 2021]  ? sysv68_partition+0x280/0x280
[Fri Dec 10 08:39:43 2021]  ? wbt_cleanup_cb+0x20/0x20
[Fri Dec 10 08:39:43 2021]  wbt_wait+0x9b/0xe0
[Fri Dec 10 08:39:43 2021]  __rq_qos_throttle+0x28/0x40
[Fri Dec 10 08:39:43 2021]  blk_mq_submit_bio+0x119/0x590
[Fri Dec 10 08:39:43 2021]  submit_bio_noacct+0x2dc/0x4f0
[Fri Dec 10 08:39:43 2021]  submit_bio+0x4f/0x1b0
[Fri Dec 10 08:39:43 2021]  ? bio_add_page+0x6a/0x90
[Fri Dec 10 08:39:43 2021]  submit_bh_wbc+0x18d/0x1c0
[Fri Dec 10 08:39:43 2021]  submit_bh+0x13/0x20
[Fri Dec 10 08:39:43 2021]  jbd2_journal_commit_transaction+0x8ee/0x1910
[Fri Dec 10 08:39:43 2021]  kjournald2+0xa9/0x280
[Fri Dec 10 08:39:43 2021]  ? wait_woken+0x80/0x80
[Fri Dec 10 08:39:43 2021]  ? load_superblock.part.0+0xb0/0xb0
[Fri Dec 10 08:39:43 2021]  kthread+0x12b/0x150
[Fri Dec 10 08:39:43 2021]  ? set_kthread_struct+0x50/0x50
[Fri Dec 10 08:39:43 2021]  ret_from_fork+0x22/0x30
[Fri Dec 10 08:39:43 2021] INFO: task cfs_loop:1691 blocked for more than 120 seconds.
[Fri Dec 10 08:39:43 2021]       Tainted: P           O      5.13.19-1-pve #1
[Fri Dec 10 08:39:43 2021] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[Fri Dec 10 08:39:43 2021] task:cfs_loop        state:D stack:    0 pid: 1691 ppid:     1 flags:0x00000000
...

Full log from that time
http://ix.io/3HyP

That pointed me to dm-9, so I replaced the OS SSD just in case (different size, different model), but I still get the same errors.
Code:
root@server:~# dmsetup info /dev/dm-9
Name:              pve-root
State:             ACTIVE
Read Ahead:        256
Tables present:    LIVE
Open count:        1
Event number:      0
Major, minor:      253, 9
Number of targets: 1
UUID: LVM-oAidPr2xUSz6O90Zc1Exu5QLWukpvuTRFAhhr8ejUmP0mnBaYvW8SpZ210OkURKI
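
In case it is useful: the dm-N <-> LV mapping can also be read with plain lsblk / dmsetup (standard tooling, nothing Proxmox-specific):
Code:
# list all device-mapper nodes with their LVM names
dmsetup ls
# show what sits on /dev/dm-9 and where it is mounted
lsblk -o NAME,KNAME,TYPE,SIZE,MOUNTPOINT /dev/dm-9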

I am not sure whether pve-root is handled differently, or why it could trigger a kernel panic.
Is this a known issue? Is there something I could test?
 
Please provide the output of pveversion -v and lscpu.

I assume you use `krbd` based on the output? Do you have the same issue if you disable it?
Are there any network issues perhaps?
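
If disabling it for a test is an option: krbd can be toggled per RBD storage with pvesm, roughly like this (sketch; <rbd-storage-id> is a placeholder for your storage name, and guests need to be restarted or migrated to pick the change up):
Code:
# switch the RBD storage from the kernel client (krbd) to librbd
pvesm set <rbd-storage-id> --krbd 0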
 
root@server:~# pveversion -v
proxmox-ve: 7.1-1 (running kernel: 5.13.19-2-pve)
pve-manager: 7.1-7 (running version: 7.1-7/df5740ad)
pve-kernel-helper: 7.1-6
pve-kernel-5.13: 7.1-5
pve-kernel-5.4: 6.4-7
pve-kernel-5.13.19-2-pve: 5.13.19-4
pve-kernel-5.13.19-1-pve: 5.13.19-3
pve-kernel-5.4.143-1-pve: 5.4.143-1
pve-kernel-5.4.128-1-pve: 5.4.128-2
pve-kernel-5.4.119-1-pve: 5.4.119-1
pve-kernel-5.4.106-1-pve: 5.4.106-1
pve-kernel-5.4.101-1-pve: 5.4.101-1
pve-kernel-5.4.78-2-pve: 5.4.78-2
pve-kernel-5.4.78-1-pve: 5.4.78-1
pve-kernel-5.4.65-1-pve: 5.4.65-1
pve-kernel-5.4.44-2-pve: 5.4.44-2
pve-kernel-5.4.34-1-pve: 5.4.34-2
ceph: 16.2.6-pve2
ceph-fuse: 16.2.6-pve2
corosync: 3.1.5-pve2
criu: 3.15-1+pve-1
glusterfs-client: 9.2-1
ifupdown: residual config
ifupdown2: 3.1.0-1+pmx3
ksm-control-daemon: 1.4-1
libjs-extjs: 7.0.0-1
libknet1: 1.22-pve2
libproxmox-acme-perl: 1.4.0
libproxmox-backup-qemu0: 1.2.0-1
libpve-access-control: 7.1-5
libpve-apiclient-perl: 3.2-1
libpve-common-perl: 7.0-14
libpve-guest-common-perl: 4.0-3
libpve-http-server-perl: 4.0-4
libpve-storage-perl: 7.0-15
libqb0: 1.0.5-1
libspice-server1: 0.14.3-2.1
lvm2: 2.03.11-2.1
lxc-pve: 4.0.9-4
lxcfs: 4.0.8-pve2
novnc-pve: 1.2.0-3
proxmox-backup-client: 2.1.2-1
proxmox-backup-file-restore: 2.1.2-1
proxmox-mini-journalreader: 1.3-1
proxmox-widget-toolkit: 3.4-4
pve-cluster: 7.1-2
pve-container: 4.1-2
pve-docs: 7.1-2
pve-edk2-firmware: 3.20210831-2
pve-firewall: 4.2-5
pve-firmware: 3.3-3
pve-ha-manager: 3.3-1
pve-i18n: 2.6-2
pve-qemu-kvm: 6.1.0-3
pve-xtermjs: 4.12.0-1
qemu-server: 7.1-4
smartmontools: 7.2-pve2
spiceterm: 3.2-2
swtpm: 0.7.0~rc1+2
vncterm: 1.7-1
zfsutils-linux: 2.1.1-pve3

root@server:~# lscpu
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
Address sizes: 43 bits physical, 48 bits virtual
CPU(s): 32
On-line CPU(s) list: 0-31
Thread(s) per core: 2
Core(s) per socket: 16
Socket(s): 1
NUMA node(s): 1
Vendor ID: AuthenticAMD
CPU family: 23
Model: 113
Model name: AMD Ryzen 9 3950X 16-Core Processor
Stepping: 0
Frequency boost: enabled
CPU MHz: 3500.000
CPU max MHz: 4761.2300
CPU min MHz: 2200.0000
BogoMIPS: 6999.98
Virtualization: AMD-V
L1d cache: 512 KiB
L1i cache: 512 KiB
L2 cache: 8 MiB
L3 cache: 64 MiB
NUMA node0 CPU(s): 0-31
Vulnerability Itlb multihit: Not affected
Vulnerability L1tf: Not affected
Vulnerability Mds: Not affected
Vulnerability Meltdown: Not affected
Vulnerability Spec store bypass: Mitigation; Speculative Store Bypass disabled via prctl and seccomp
Vulnerability Spectre v1: Mitigation; usercopy/swapgs barriers and __user pointer sanitization
Vulnerability Spectre v2: Mitigation; Full AMD retpoline, IBPB conditional, STIBP conditional, RSB filling
Vulnerability Srbds: Not affected
Vulnerability Tsx async abort: Not affected
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl nonstop_tsc cpuid extd_apicid aperfmperf pni pclmulqdq monitor ssse3 fma cx
16 sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs skinit wdt tce topoext perfctr_core perfctr_nb bpext perfctr_llc mwaitx cpb cat_l3 cdp_l3 hw_pstate ssbd
mba ibpb stibp vmmcall fsgsbase bmi1 avx2 smep bmi2 cqm rdt_a rdseed adx smap clflushopt clwb sha_ni xsaveopt xsavec xgetbv1 xsaves cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local clzero irperf xsaveerptr rdpru wbnoinvd arat npt lbrv svm_lock nri
p_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold avic v_vmsave_vmload vgif v_spec_ctrl umip rdpid overflow_recov succor smca sme sev sev_es

Ceph is used for storage, plus an additional NVMe for some LXCs; this has been stable so far.
I can't easily disable Ceph, and since I can't reproduce the problem I have to wait 1-3 days after every change.

We do see frequent restarts of the network bridges for some reason, but that didn't cause any network outages. Also, that shouldn't lead to kernel panics, should it?

[Fri Dec 10 12:58:11 2021] br-411964f160bb: port 9(veth630ac32) entered blocking state
[Fri Dec 10 12:58:11 2021] br-411964f160bb: port 9(veth630ac32) entered disabled state
[Fri Dec 10 12:58:11 2021] device veth630ac32 entered promiscuous mode
[Fri Dec 10 12:58:11 2021] br-411964f160bb: port 9(veth630ac32) entered blocking state
[Fri Dec 10 12:58:11 2021] br-411964f160bb: port 9(veth630ac32) entered forwarding state
[Fri Dec 10 12:58:11 2021] overlayfs: fs on '/var/lib/docker/overlay2/l/QJYTRIPTZFRVS5YXS4GDIPXL5A' does not support file handles, falling back to xino=off.
[Fri Dec 10 12:58:11 2021] overlayfs: fs on '/var/lib/docker/overlay2/l/QJYTRIPTZFRVS5YXS4GDIPXL5A' does not support file handles, falling back to xino=off.
[Fri Dec 10 12:58:11 2021] br-411964f160bb: port 10(veth3e60bdc) entered blocking state
[Fri Dec 10 12:58:11 2021] br-411964f160bb: port 10(veth3e60bdc) entered disabled state
[Fri Dec 10 12:58:11 2021] device veth3e60bdc entered promiscuous mode
[Fri Dec 10 12:58:11 2021] br-411964f160bb: port 10(veth3e60bdc) entered blocking state
[Fri Dec 10 12:58:11 2021] br-411964f160bb: port 10(veth3e60bdc) entered forwarding state
[Fri Dec 10 12:58:11 2021] eth0: renamed from veth18df3a5
[Fri Dec 10 12:58:11 2021] br-411964f160bb: port 9(veth630ac32) entered disabled state
[Fri Dec 10 12:58:11 2021] br-411964f160bb: port 10(veth3e60bdc) entered disabled state
[Fri Dec 10 12:58:11 2021] IPv6: ADDRCONF(NETDEV_CHANGE): vethd02e999: link becomes ready
[Fri Dec 10 12:58:11 2021] br-411964f160bb: port 8(vethd02e999) entered blocking state
[Fri Dec 10 12:58:11 2021] br-411964f160bb: port 8(vethd02e999) entered forwarding state
[Fri Dec 10 12:58:11 2021] eth0: renamed from veth8c59841
[Fri Dec 10 12:58:11 2021] IPv6: ADDRCONF(NETDEV_CHANGE): veth630ac32: link becomes ready
[Fri Dec 10 12:58:11 2021] br-411964f160bb: port 9(veth630ac32) entered blocking state
[Fri Dec 10 12:58:11 2021] br-411964f160bb: port 9(veth630ac32) entered forwarding state
[Fri Dec 10 12:58:11 2021] eth0: renamed from vethde0f4c4
[Fri Dec 10 12:58:11 2021] IPv6: ADDRCONF(NETDEV_CHANGE): veth3e60bdc: link becomes ready
[Fri Dec 10 12:58:11 2021] br-411964f160bb: port 10(veth3e60bdc) entered blocking state
[Fri Dec 10 12:58:11 2021] br-411964f160bb: port 10(veth3e60bdc) entered forwarding state
[Fri Dec 10 12:58:13 2021] br-411964f160bb: port 10(veth3e60bdc) entered disabled state
[Fri Dec 10 12:58:13 2021] vethde0f4c4: renamed from eth0
[Fri Dec 10 12:58:13 2021] br-411964f160bb: port 10(veth3e60bdc) entered disabled state
[Fri Dec 10 12:58:13 2021] device veth3e60bdc left promiscuous mode
[Fri Dec 10 12:58:13 2021] br-411964f160bb: port 10(veth3e60bdc) entered disabled state
[Fri Dec 10 12:59:44 2021] br-411964f160bb: port 6(veth2a50e6b) entered disabled state
[Fri Dec 10 12:59:44 2021] vethdd8c67c: renamed from eth0
[Fri Dec 10 12:59:44 2021] br-411964f160bb: port 5(vethc3012d9) entered disabled state
[Fri Dec 10 12:59:44 2021] br-411964f160bb: port 8(vethd02e999) entered disabled state
[Fri Dec 10 12:59:44 2021] veth37f94c7: renamed from eth0
[Fri Dec 10 12:59:44 2021] veth18df3a5: renamed from eth0
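
In case it is relevant: the "libceph: osd... down" lines from the first log can be cross-checked on the Ceph side with the usual CLI (just for completeness):
Code:
ceph -s                      # overall cluster health
ceph osd tree                # which OSDs are down, and on which host
dmesg -T | grep -i libceph   # what the kernel RBD client on this node saw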
 
Those are not panics, but stack traces of stuck tasks (D state, waiting on I/O).

Are the disks attached to a RAID controller?
Could you try the latest 5.11 kernel?
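
To see which tasks are stuck in uninterruptible sleep (state D) while this happens, something like the following should do (plain ps, no Proxmox specifics):
Code:
# list tasks in D state together with the kernel function they are waiting in
ps -eo pid,stat,wchan:32,comm | awk '$2 ~ /D/'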
 
Yes, testing 5.15 is also a good idea, as that will probably be the kernel for 7.2.
 
They are connected to the mainboard SATA controller, and no other components were switched or added in the meantime (6.4 -> 7.1).


5.11? You mean 5.15, I guess?
If 5.15 doesn't fix the issue, then please try 5.11 as well.
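
For reference, the opt-in kernels are installed via their meta-packages, roughly like this (please double-check the exact package names against your configured repositories):
Code:
apt update
apt install pve-kernel-5.15   # opt-in kernel series for PVE 7.x
reboot
# the 5.11 series works the same way: apt install pve-kernel-5.11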
 
I updated the affected server to 5.15; it has not crashed yet (~1 day running).
Meanwhile another server (still on 5.13) has crashed with a slightly different error, also pointing to pve-root.

Code:
[Tue Dec 14 14:36:15 2021] INFO: task khugepaged:215 blocked for more than 120 seconds.
[Tue Dec 14 14:36:15 2021]       Tainted: P           O      5.13.19-1-pve #1
[Tue Dec 14 14:36:15 2021] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[Tue Dec 14 14:36:15 2021] task:khugepaged      state:D stack:    0 pid:  215 ppid:     2 flags:0x00004000
[Tue Dec 14 14:36:15 2021] Call Trace:
[Tue Dec 14 14:36:15 2021]  ? wbt_cleanup_cb+0x20/0x20
[Tue Dec 14 14:36:15 2021]  __schedule+0x2fa/0x910
[Tue Dec 14 14:36:15 2021]  ? wbt_cleanup_cb+0x20/0x20
[Tue Dec 14 14:36:15 2021]  schedule+0x4f/0xc0
[Tue Dec 14 14:36:15 2021]  io_schedule+0x46/0x70
[Tue Dec 14 14:36:15 2021]  rq_qos_wait+0xbd/0x150
[Tue Dec 14 14:36:15 2021]  ? sysv68_partition+0x280/0x280
[Tue Dec 14 14:36:15 2021]  ? wbt_cleanup_cb+0x20/0x20
[Tue Dec 14 14:36:15 2021]  ? do_madvise+0x50/0x50
[Tue Dec 14 14:36:15 2021]  wbt_wait+0x9b/0xe0
[Tue Dec 14 14:36:15 2021]  __rq_qos_throttle+0x28/0x40
[Tue Dec 14 14:36:15 2021]  blk_mq_submit_bio+0x119/0x590
[Tue Dec 14 14:36:15 2021]  ? do_madvise+0x50/0x50
[Tue Dec 14 14:36:15 2021]  submit_bio_noacct+0x2dc/0x4f0
[Tue Dec 14 14:36:15 2021]  ? unlock_page_memcg+0x46/0x80
[Tue Dec 14 14:36:15 2021]  ? do_madvise+0x50/0x50
[Tue Dec 14 14:36:15 2021]  submit_bio+0x4f/0x1b0
[Tue Dec 14 14:36:15 2021]  ? do_madvise+0x50/0x50
[Tue Dec 14 14:36:15 2021]  ? bio_add_page+0x6a/0x90
[Tue Dec 14 14:36:15 2021]  __swap_writepage+0x19c/0x490
[Tue Dec 14 14:36:15 2021]  swap_writepage+0x35/0x90
[Tue Dec 14 14:36:15 2021]  pageout+0xf7/0x310
[Tue Dec 14 14:36:15 2021]  shrink_page_list+0x8e4/0xcd0
[Tue Dec 14 14:36:15 2021]  shrink_inactive_list+0x163/0x420
[Tue Dec 14 14:36:15 2021]  shrink_lruvec+0x445/0x6c0
[Tue Dec 14 14:36:15 2021]  shrink_node+0x2b2/0x6e0
[Tue Dec 14 14:36:15 2021]  do_try_to_free_pages+0xd7/0x4d0
[Tue Dec 14 14:36:15 2021]  try_to_free_pages+0xf5/0x1b0
[Tue Dec 14 14:36:15 2021]  __alloc_pages_slowpath.constprop.0+0x3d6/0xd80
[Tue Dec 14 14:36:15 2021]  __alloc_pages+0x30e/0x330
[Tue Dec 14 14:36:15 2021]  khugepaged+0x1070/0x20b0
[Tue Dec 14 14:36:15 2021]  ? wait_woken+0x80/0x80
[Tue Dec 14 14:36:15 2021]  ? collapse_pte_mapped_thp+0x440/0x440
[Tue Dec 14 14:36:15 2021]  kthread+0x12b/0x150
[Tue Dec 14 14:36:15 2021]  ? set_kthread_struct+0x50/0x50
[Tue Dec 14 14:36:15 2021]  ret_from_fork+0x22/0x30
[Tue Dec 14 14:36:15 2021] INFO: task jbd2/dm-3-8:618 blocked for more than 120 seconds.

full dmesg of that timeframe

server1:~# pveversion -v
proxmox-ve: 7.1-1 (running kernel: 5.13.19-1-pve)
pve-manager: 7.1-6 (running version: 7.1-6/4e61e21c)
pve-kernel-5.13: 7.1-4
pve-kernel-helper: 7.1-4
pve-kernel-5.4: 6.4-7
pve-kernel-5.3: 6.1-6
pve-kernel-5.0: 6.0-11
pve-kernel-5.13.19-1-pve: 5.13.19-3
pve-kernel-5.4.143-1-pve: 5.4.143-1
pve-kernel-5.4.128-1-pve: 5.4.128-2
pve-kernel-5.4.119-1-pve: 5.4.119-1
pve-kernel-5.4.106-1-pve: 5.4.106-1
pve-kernel-5.4.101-1-pve: 5.4.101-1
pve-kernel-5.4.78-2-pve: 5.4.78-2
pve-kernel-5.4.78-1-pve: 5.4.78-1
pve-kernel-5.4.65-1-pve: 5.4.65-1
pve-kernel-5.4.44-1-pve: 5.4.44-1
pve-kernel-4.15: 5.4-9
pve-kernel-5.3.18-3-pve: 5.3.18-3
pve-kernel-5.3.13-1-pve: 5.3.13-1
pve-kernel-5.0.21-5-pve: 5.0.21-10
pve-kernel-5.0.21-3-pve: 5.0.21-7
pve-kernel-4.15.18-21-pve: 4.15.18-48
pve-kernel-4.15.18-12-pve: 4.15.18-36
ceph: 16.2.6-pve2
ceph-fuse: 16.2.6-pve2
corosync: 3.1.5-pve2
criu: 3.15-1+pve-1
glusterfs-client: 9.2-1
ifupdown: residual config
ifupdown2: 3.1.0-1+pmx3
ksm-control-daemon: 1.4-1
libjs-extjs: 7.0.0-1
libknet1: 1.22-pve2
libproxmox-acme-perl: 1.4.0
libproxmox-backup-qemu0: 1.2.0-1
libpve-access-control: 7.1-5
libpve-apiclient-perl: 3.2-1
libpve-common-perl: 7.0-14
libpve-guest-common-perl: 4.0-3
libpve-http-server-perl: 4.0-3
libpve-storage-perl: 7.0-15
libqb0: 1.0.5-1
libspice-server1: 0.14.3-2.1
lvm2: 2.03.11-2.1
lxc-pve: 4.0.9-4
lxcfs: 4.0.8-pve2
novnc-pve: 1.2.0-3
proxmox-backup-client: 2.1.2-1
proxmox-backup-file-restore: 2.1.2-1
proxmox-mini-journalreader: 1.2-1
proxmox-widget-toolkit: 3.4-3
pve-cluster: 7.1-2
pve-container: 4.1-2
pve-docs: 7.1-2
pve-edk2-firmware: 3.20210831-2
pve-firewall: 4.2-5
pve-firmware: 3.3-3
pve-ha-manager: 3.3-1
pve-i18n: 2.6-2
pve-qemu-kvm: 6.1.0-2
pve-xtermjs: 4.12.0-1
qemu-server: 7.1-4
smartmontools: 7.2-pve2
spiceterm: 3.2-2
swtpm: 0.7.0~rc1+2
vncterm: 1.7-1
zfsutils-linux: 2.1.1-pve3


root@server1:~# lscpu
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
Address sizes: 43 bits physical, 48 bits virtual
CPU(s): 32
On-line CPU(s) list: 0-31
Thread(s) per core: 2
Core(s) per socket: 16
Socket(s): 1
NUMA node(s): 1
Vendor ID: AuthenticAMD
CPU family: 23
Model: 8
Model name: AMD Ryzen Threadripper 2950X 16-Core Processor
Stepping: 2
Frequency boost: enabled
CPU MHz: 3500.000
CPU max MHz: 3500.0000
CPU min MHz: 2200.0000
BogoMIPS: 6999.24
Virtualization: AMD-V
L1d cache: 512 KiB
L1i cache: 1 MiB
L2 cache: 8 MiB
L3 cache: 32 MiB
NUMA node0 CPU(s): 0-31
Vulnerability Itlb multihit: Not affected
Vulnerability L1tf: Not affected
Vulnerability Mds: Not affected
Vulnerability Meltdown: Not affected
Vulnerability Spec store bypass: Mitigation; Speculative Store Bypass disabled via prctl and seccomp
Vulnerability Spectre v1: Mitigation; usercopy/swapgs barriers and __user pointer sanitization
Vulnerability Spectre v2: Mitigation; Full AMD retpoline, IBPB conditional, STIBP disabled, RSB filling
Vulnerability Srbds: Not affected
Vulnerability Tsx async abort: Not affected
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl nonstop_tsc cpuid extd_apicid amd_dcm aperfmperf pni pclmulqdq monitor ssse
3 fma cx16 sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw skinit wdt tce topoext perfctr_core perfctr_nb bpext perfctr_llc mwaitx cpb hw_pstate ssbd ibpb vmmca
ll fsgsbase bmi1 avx2 smep bmi2 rdseed adx smap clflushopt sha_ni xsaveopt xsavec xgetbv1 xsaves clzero irperf xsaveerptr arat npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold avic v_vmsave_vmload v
gif overflow_recov succor smca sme sev sev_es
 
Got I/O issues with kernel 5.13.19-6-pve on a node with an NVMe drive installed. I had this issue some time ago on another kernel and solved it back then by moving to an updated kernel version; it appeared again with the 5.13.19-6 one. Upgraded to 5.15 just 6 hours ago, all good so far. Will keep monitoring.
 
PS: some more findings

Code:
ls -la /etc/pve/qemu-server/
gives you an idea which VMs are still running on this host

Code:
# qm migrate still works in this state
qm migrate <id> <target server>
ps ax blocks just before the stale VM, but you can still loop over procfs to get process info, e.g.:
Code:
# print the name of every running process straight from procfs (works even when ps hangs)
for i in $(ls /proc | grep -E '^[0-9]'); do echo "$i: $(grep Name /proc/$i/status)"; done
 
We use Ubuntu 22.x and got a kernel panic as well. After stopping and starting it again, it is back to normal. Any idea why? Nothing changed, only the reboot.