VM soft lockups under heavy load on PVE 8.0 + kernel 6.x

pandada8

I recently ran into VM soft lockups when running heavy compute and I/O loads.
It began on PVE 7.4 with the 6.1 kernel. When PVE 8.0 was released, I upgraded all nodes to kernel 6.2, but nothing seems to have changed.
The hosts are dual-socket EPYC 7742 and 7702 systems. I tried upgrading the CPU microcode from the unstable repo, which gives `patch_level=0x08301072`. That seems to have made things a little better, but there are still a lot of soft lockups.
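
For anyone who wants to check how often their guests hit this, here is a minimal sketch that counts soft lockup reports in kernel log text (for example, saved `dmesg` output from inside the VM). It assumes the usual watchdog message format, `BUG: soft lockup - CPU#N stuck for Ns! [comm:pid]`. The function name and the sample log lines are made up for illustration and are not taken from the affected VMs.

```python
import re

# Assumed kernel watchdog format:
#   watchdog: BUG: soft lockup - CPU#<n> stuck for <s>s! [<comm>:<pid>]
SOFT_LOCKUP = re.compile(
    r"soft lockup - CPU#(?P<cpu>\d+) stuck for (?P<secs>\d+)s! "
    r"\[(?P<comm>[^\]]*):(?P<pid>\d+)\]"
)


def find_soft_lockups(log_text):
    """Return one dict per soft lockup report found in the kernel log text."""
    events = []
    for line in log_text.splitlines():
        m = SOFT_LOCKUP.search(line)
        if m:
            events.append({
                "cpu": int(m["cpu"]),
                "secs": int(m["secs"]),
                "comm": m["comm"],
                "pid": int(m["pid"]),
            })
    return events


# Hypothetical sample lines, invented for this example.
SAMPLE_LOG = """\
[ 1234.567890] watchdog: BUG: soft lockup - CPU#12 stuck for 26s! [kworker/12:1:4321]
[ 1234.600000] Modules linked in: virtio_net virtio_scsi
[ 1262.567890] watchdog: BUG: soft lockup - CPU#12 stuck for 52s! [kworker/12:1:4321]
"""

if __name__ == "__main__":
    for event in find_soft_lockups(SAMPLE_LOG):
        print(event)
```

To use it on a real guest, replace `SAMPLE_LOG` with the contents of a saved kernel log.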
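
To confirm that a microcode update actually took effect on every core, something like the following sketch may help. It assumes the `microcode` field that x86 Linux exposes in `/proc/cpuinfo`. The helper names are hypothetical and the sample text is invented (the second core's older revision is only a placeholder); the target value is the `patch_level` mentioned above.

```python
def microcode_revisions(cpuinfo_text):
    """Return the set of microcode revisions (as ints) in /proc/cpuinfo-style text."""
    revisions = set()
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "microcode":
            revisions.add(int(value.strip(), 16))
    return revisions


def all_cores_at_least(cpuinfo_text, wanted):
    """True if at least one core is listed and every core reports >= wanted."""
    revisions = microcode_revisions(cpuinfo_text)
    return bool(revisions) and min(revisions) >= wanted


# Hypothetical excerpt; the second core's revision is an invented older value.
SAMPLE_CPUINFO = """\
processor\t: 0
model name\t: AMD EPYC 7742 64-Core Processor
microcode\t: 0x8301072
processor\t: 1
model name\t: AMD EPYC 7742 64-Core Processor
microcode\t: 0x8301055
"""

if __name__ == "__main__":
    print([hex(r) for r in sorted(microcode_revisions(SAMPLE_CPUINFO))])
    print(all_cores_at_least(SAMPLE_CPUINFO, 0x08301072))
```

On a host you would pass `open("/proc/cpuinfo").read()` instead of the sample text.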
 
Switching to aio=threads seems to fix the soft lockup issue. Still, I hope io_uring can be enabled one day, since it seems to have better I/O performance.
 
So...
I switched to the new 6.2.16-5-pve kernel and the soft lockups came back!

All of the locked-up VMs are running with aio=native and virtio-scsi-single.

e.g.

Code:
acpi: 1
agent: enabled=1
bios: seabios
boot: order=scsi0
cicustom: vendor=cephfs:snippets/ci-k8s.yaml
cores: 64
cpu: host
ide2: hp6hdd:155/vm-155-cloudinit.raw,media=cdrom,size=4M
ipconfig0: ip=10.2.12.241/24,gw=10.2.12.1
machine: q35
memory: 262144
meta: creation-qemu=6.2.0,ctime=1650715383
name: ci-k8s-1
net0: virtio=3A:03:AD:06:8D:60,bridge=vmbr0,tag=1012
net1: virtio=9A:AF:79:97:6B:7D,bridge=bachang
numa: 0
ostype: l26
scsi0: hp6hdd:155/vm-155-disk-0.qcow2,aio=native,discard=on,iothread=1,size=10G
scsi1: hp9hdd:155/vm-155-disk-0.qcow2,aio=native,backup=0,discard=on,iothread=1,size=200G
scsihw: virtio-scsi-single
serial0: socket
smbios1: uuid=8d78b470-3dfa-4c23-b55e-9b4d04233123
sockets: 1
vmgenid: 97a33c7a-9a88-4af5-a3e7-a5ea1145c906
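
If it helps anyone compare setups, here is a rough sketch that reads a VM config like the one above, reports the `aio` mode of each disk, and prints (but does not run) `qm set` commands that would switch them to another mode. The helper names are hypothetical, `CONFIG` is a trimmed copy of the config above, and whether a different mode avoids the lockups on a given kernel is not something this script can tell you.

```python
import re

# Disk keys in a Proxmox VM config, e.g. scsi0 or virtio1 (scsihw does not match).
DISK_KEY = re.compile(r"^(scsi|virtio|sata|ide)\d+$")


def disk_specs(config_text):
    """Map disk key -> option string, skipping CD-ROM/cloud-init drives."""
    disks = {}
    for line in config_text.splitlines():
        key, sep, value = line.partition(":")
        key, value = key.strip(), value.strip()
        if sep and DISK_KEY.match(key) and "media=cdrom" not in value.split(","):
            disks[key] = value
    return disks


def aio_mode(spec):
    """Return the aio= value of a disk spec, or None if it is not set."""
    for opt in spec.split(","):
        if opt.startswith("aio="):
            return opt.split("=", 1)[1]
    return None


def with_aio(spec, mode):
    """Return the disk spec with its aio option replaced (or added) as aio=<mode>."""
    opts = [o for o in spec.split(",") if not o.startswith("aio=")]
    return ",".join([opts[0], f"aio={mode}"] + opts[1:])


# Trimmed copy of the VM config from this post.
CONFIG = """\
ide2: hp6hdd:155/vm-155-cloudinit.raw,media=cdrom,size=4M
scsi0: hp6hdd:155/vm-155-disk-0.qcow2,aio=native,discard=on,iothread=1,size=10G
scsi1: hp9hdd:155/vm-155-disk-0.qcow2,aio=native,backup=0,discard=on,iothread=1,size=200G
scsihw: virtio-scsi-single
"""

if __name__ == "__main__":
    for key, spec in disk_specs(CONFIG).items():
        print(f"{key}: aio={aio_mode(spec)}")
        # Printed for review only; nothing is executed here.
        print(f"qm set 155 --{key} {with_aio(spec, 'threads')}")
```

Review the printed commands before running any of them on a real node.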

The host is a dual-socket AMD EPYC 7742.

Code:
pveversion --verbose
proxmox-ve: 8.0.1 (running kernel: 6.2.16-5-pve)
pve-manager: 8.0.3 (running version: 8.0.3/bbf3993334bfa916)
pve-kernel-6.2: 8.0.4
pve-kernel-5.15: 7.4-4
pve-kernel-6.1: 7.3-6
pve-kernel-6.2.16-5-pve: 6.2.16-6
pve-kernel-6.2.16-3-pve: 6.2.16-3
pve-kernel-6.1.15-1-pve: 6.1.15-1
pve-kernel-5.15.108-1-pve: 5.15.108-1
ceph: 17.2.6-pve1+3
ceph-fuse: 17.2.6-pve1+3
corosync: 3.1.7-pve3
criu: 3.17.1-2
glusterfs-client: 10.3-5
ifupdown: residual config
ifupdown2: 3.2.0-1+pmx3
ksm-control-daemon: 1.4-1
libjs-extjs: 7.0.0-3
libknet1: 1.25-pve1
libproxmox-acme-perl: 1.4.6
libproxmox-backup-qemu0: 1.4.0
libproxmox-rs-perl: 0.3.1
libpve-access-control: 8.0.4
libpve-apiclient-perl: 3.3.1
libpve-common-perl: 8.0.7
libpve-guest-common-perl: 5.0.4
libpve-http-server-perl: 5.0.4
libpve-network-perl: 0.8.1
libpve-rs-perl: 0.8.5
libpve-storage-perl: 8.0.2
libqb0: 1.0.5-1
libspice-server1: 0.15.1-1
lvm2: 2.03.16-2
lxc-pve: 5.0.2-4
lxcfs: 5.0.3-pve3
novnc-pve: 1.4.0-2
openvswitch-switch: 3.1.0-2
proxmox-backup-client: 3.0.1-1
proxmox-backup-file-restore: 3.0.1-1
proxmox-kernel-helper: 8.0.2
proxmox-mail-forward: 0.2.0
proxmox-mini-journalreader: 1.4.0
proxmox-offline-mirror-helper: 0.6.2
proxmox-widget-toolkit: 4.0.6
pve-cluster: 8.0.3
pve-container: 5.0.4
pve-docs: 8.0.4
pve-edk2-firmware: 3.20230228-4
pve-firewall: 5.0.3
pve-firmware: 3.7-1
pve-ha-manager: 4.0.2
pve-i18n: 3.0.5
pve-qemu-kvm: 8.0.2-3
pve-xtermjs: 4.16.0-3
qemu-server: 8.0.6
smartmontools: 7.3-pve1
spiceterm: 3.3.0
swtpm: 0.8.0+pve1
vncterm: 1.8.0
zfsutils-linux: 2.1.12-pve1
 