Ubuntu VMs freezing on node with AMD Ryzen 9 5900HX

davidroller

New Member
Oct 5, 2022
Hi there,

All Ubuntu VMs that I have migrated to a new node with an AMD Ryzen 9 5900HX start freezing after a while. When I first started migrating the VMs to the new node, I got soft lockup error messages and the VMs became unresponsive, requiring a hard reboot.

While researching, I found people who fixed a similar issue by changing the SCSI controller from VirtIO SCSI to VirtIO SCSI single and adding the additional parameters iothread,aio=threads. I tried the same thing, but it didn't work for me. I'm no longer getting the soft lockup error messages, but the VM still freezes and the only way to get it working again is to reboot it.

Does anybody know what could be causing this problem?

Below you can find more details about the environment.

[vm]
agent: 1
boot: order=virtio0;ide2;net0
cores: 2
ide2: none,media=cdrom
memory: 2048
meta: creation-qemu=6.2.0,ctime=1650769938
name: fs-gua-001
net0: virtio=82:0C:61:53:FE:93,bridge=vmbr1,firewall=1
numa: 0
onboot: 1
ostype: l26
scsihw: virtio-scsi-single,iothread,aio=threads
smbios1: uuid=aacad350-4b0b-40e8-9339-2e30d0d2bbdf
sockets: 1
virtio0: local-lvm:vm-107-disk-0,discard=on,format=raw,size=40G
vmgenid: 2da4787e-7308-47f8-a806-79a9a5da673e
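
A side note on the [vm] block above: in Proxmox VE, iothread and aio are normally options of the individual disk line (here virtio0), not of scsihw, so they may not be doing anything where they are placed in this config. Here is a minimal Python sketch, not an official tool, that parses a config in the format posted above and reports disk-only options found on the scsihw line. The function names and the DISK_ONLY set are assumptions made for this illustration.

```python
# Hypothetical helper: names and the DISK_ONLY set are illustrative assumptions.
DISK_ONLY = {"iothread", "aio"}


def parse_vm_config(text):
    """Parse 'key: value' lines into a dict, skipping blanks and [section] lines."""
    config = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("[") or ":" not in line:
            continue
        key, value = line.split(":", 1)
        config[key.strip()] = value.strip()
    return config


def split_options(value):
    """Split 'head,opt,opt=val' into (head, {opt: val}); bare options map to '1'."""
    head, *rest = value.split(",")
    options = {}
    for item in rest:
        name, _, val = item.partition("=")
        options[name] = val or "1"
    return head, options


def misplaced_disk_options(config):
    """Return the disk-only options that appear on the scsihw line, sorted."""
    _, options = split_options(config.get("scsihw", ""))
    return sorted(DISK_ONLY & options.keys())
```

Calling misplaced_disk_options(parse_vm_config(text)) on the config above would list both options, which suggests moving them to the virtio0 line if the intent was to test them.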

[pve]
proxmox-ve: 7.2-1 (running kernel: 5.15.30-2-pve)
pve-manager: 7.2-11 (running version: 7.2-11/b76d3178)
pve-kernel-helper: 7.2-12
pve-kernel-5.15: 7.2-11
pve-kernel-5.15.60-1-pve: 5.15.60-1
pve-kernel-5.15.30-2-pve: 5.15.30-3
ceph-fuse: 15.2.16-pve1
corosync: 3.1.5-pve2
criu: 3.15-1+pve-1
glusterfs-client: 9.2-1
ifupdown2: 3.1.0-1+pmx3
ksm-control-daemon: 1.4-1
libjs-extjs: 7.0.0-1
libknet1: 1.24-pve1
libproxmox-acme-perl: 1.4.2
libproxmox-backup-qemu0: 1.3.1-1
libpve-access-control: 7.2-4
libpve-apiclient-perl: 3.2-1
libpve-common-perl: 7.2-2
libpve-guest-common-perl: 4.1-2
libpve-http-server-perl: 4.1-3
libpve-storage-perl: 7.2-10
libspice-server1: 0.14.3-2.1
lvm2: 2.03.11-2.1
lxc-pve: 5.0.0-3
lxcfs: 4.0.12-pve1
novnc-pve: 1.3.0-3
proxmox-backup-client: 2.2.6-1
proxmox-backup-file-restore: 2.2.6-1
proxmox-mini-journalreader: 1.3-1
proxmox-widget-toolkit: 3.5.1
pve-cluster: 7.2-2
pve-container: 4.2-2
pve-docs: 7.2-2
pve-edk2-firmware: 3.20220526-1
pve-firewall: 4.2-6
pve-firmware: 3.5-3
pve-ha-manager: 3.4.0
pve-i18n: 2.7-2
pve-qemu-kvm: 7.0.0-3
pve-xtermjs: 4.16.0-1
qemu-server: 7.2-4
smartmontools: 7.2-pve3
spiceterm: 3.2-2
swtpm: 0.7.1~bpo11+1
vncterm: 1.7-1
zfsutils-linux: 2.1.5-pve1
 
Hi,
proxmox-ve: 7.2-1 (running kernel: 5.15.30-2-pve)
...
pve-kernel-5.15.60-1-pve: 5.15.60-1
While you have the newest kernel of our stable series installed (5.15.60-1), you are still running a fairly old one (5.15.30-2, released in April). I'd recommend rebooting into the new kernel; it has fixed a lot of things over the last six months.
 
Hi Thomas,
I forgot to mention that booting the newer kernel was another thing I had already tried.
 
In general:
  • Are the BIOS/UEFI and all firmware up to date?
  • You could try the opt-in 5.19 kernel: [1]
  • And/or try the AMD microcode package: [2]

  • Are really only Ubuntu VMs affected?
  • Did you test other OSs, e.g. Debian and/or Windows?
  • Do the freezes always occur, and not only on live migration?

[1] https://forum.proxmox.com/threads/opt-in-linux-5-19-kernel-for-proxmox-ve-7-x-available.115090
[2] https://wiki.debian.org/Microcode
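
To see whether a microcode update actually took effect, one option is to compare the microcode revision reported by the host before and after installing the package and rebooting. The following is a small Python sketch, assuming a Linux x86 host where /proc/cpuinfo exposes a "microcode" field per logical CPU. The function name is made up for this example.

```python
def microcode_revisions(cpuinfo_text):
    """Return the set of distinct 'microcode' values found in cpuinfo-style text."""
    revisions = set()
    for line in cpuinfo_text.splitlines():
        key, separator, value = line.partition(":")
        # Only lines of the form 'microcode<tab>: 0x...' are of interest.
        if separator and key.strip() == "microcode":
            revisions.add(value.strip())
    return revisions
```

On the host, this would be used as microcode_revisions(open("/proc/cpuinfo").read()). A single value that changes after the update and reboot indicates the new microcode was loaded.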
Hi Neobin,

See my answers below.

Are the BIOS/UEFI and all firmware up to date?
Yes.
You could try the opt-in 5.19 kernel: [1]
I can try that later today.
And/or try the AMD microcode package: [2]
I can try that later today.
Are really only Ubuntu VMs affected?
Yes, but I haven't migrated any Windows VMs yet.
Did you test other OSs, e.g. Debian and/or Windows?
I have only tested Ubuntu VMs (20.04.5 LTS, GNU/Linux 5.4.0-126-generic x86_64).
Do the freezes always occur, and not only on live migration?
The VM didn't freeze during the live migration, but it started freezing a few minutes later.



I have just disabled PSS support in the BIOS and restored the VM's settings to their original state. Everything looks fine so far, and I'll update you guys on whether this change resolved the issue.

[vm]
agent: 1
boot: order=virtio0;ide2;net0
cores: 2
ide2: none,media=cdrom
memory: 2048
meta: creation-qemu=6.2.0,ctime=1650769938
name: fs-gua-001
net0: virtio=82:0C:61:53:FE:93,bridge=vmbr1,firewall=1
numa: 0
onboot: 1
ostype: l26
scsihw: virtio-scsi-single
smbios1: uuid=aacad350-4b0b-40e8-9339-2e30d0d2bbdf
sockets: 1
virtio0: local-lvm:vm-107-disk-0,discard=on,format=raw,size=40G
vmgenid: 2da4787e-7308-47f8-a806-79a9a5da673e
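
Compared with the config in the first post, the only line that differs in this restored config is scsihw. A quick way to confirm that kind of thing is to diff the two configs key by key. Here is a minimal Python sketch, assuming both configs are available as text in the "key: value" format shown in this thread. The function names are made up for this example.

```python
def parse_config(text):
    """Parse 'key: value' lines into a dict, skipping blanks and [section] lines."""
    config = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("[") or ":" not in line:
            continue
        key, value = line.split(":", 1)
        config[key.strip()] = value.strip()
    return config


def config_diff(before_text, after_text):
    """Return {key: (old_value, new_value)} for every key whose value differs.

    A key missing on one side is reported with None for that side.
    """
    before = parse_config(before_text)
    after = parse_config(after_text)
    changes = {}
    for key in sorted(before.keys() | after.keys()):
        if before.get(key) != after.get(key):
            changes[key] = (before.get(key), after.get(key))
    return changes
```

Run against the two [vm] blocks in this thread, the result would contain a single entry for scsihw, which makes it easier to attribute any change in behaviour to the BIOS setting rather than to the VM config.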
 
Hi there,
The Linux VM has been running smoothly since I disabled PSS support in the BIOS. I have also live-migrated a Windows VM to this new node with no issues at all.

Below you can find the current settings:

[pve]
proxmox-ve: 7.2-1 (running kernel: 5.15.30-2-pve)
pve-manager: 7.2-11 (running version: 7.2-11/b76d3178)
pve-kernel-helper: 7.2-12
pve-kernel-5.15: 7.2-11
pve-kernel-5.15.60-1-pve: 5.15.60-1
pve-kernel-5.15.30-2-pve: 5.15.30-3
ceph-fuse: 15.2.16-pve1
corosync: 3.1.5-pve2
criu: 3.15-1+pve-1
glusterfs-client: 9.2-1
ifupdown2: 3.1.0-1+pmx3
ksm-control-daemon: 1.4-1
libjs-extjs: 7.0.0-1
libknet1: 1.24-pve1
libproxmox-acme-perl: 1.4.2
libproxmox-backup-qemu0: 1.3.1-1
libpve-access-control: 7.2-4
libpve-apiclient-perl: 3.2-1
libpve-common-perl: 7.2-2
libpve-guest-common-perl: 4.1-2
libpve-http-server-perl: 4.1-3
libpve-storage-perl: 7.2-10
libspice-server1: 0.14.3-2.1
lvm2: 2.03.11-2.1
lxc-pve: 5.0.0-3
lxcfs: 4.0.12-pve1
novnc-pve: 1.3.0-3
proxmox-backup-client: 2.2.6-1
proxmox-backup-file-restore: 2.2.6-1
proxmox-mini-journalreader: 1.3-1
proxmox-widget-toolkit: 3.5.1
pve-cluster: 7.2-2
pve-container: 4.2-2
pve-docs: 7.2-2
pve-edk2-firmware: 3.20220526-1
pve-firewall: 4.2-6
pve-firmware: 3.5-3
pve-ha-manager: 3.4.0
pve-i18n: 2.7-2
pve-qemu-kvm: 7.0.0-3
pve-xtermjs: 4.16.0-1
qemu-server: 7.2-4
smartmontools: 7.2-pve3
spiceterm: 3.2-2
swtpm: 0.7.1~bpo11+1
vncterm: 1.7-1
zfsutils-linux: 2.1.5-pve1

[linux-vm]
agent: 1
boot: order=virtio0;ide2;net0
cores: 2
ide2: none,media=cdrom
memory: 2048
meta: creation-qemu=6.2.0,ctime=1650769938
name: fs-gua-001
net0: virtio=82:0C:61:53:FE:93,bridge=vmbr1,firewall=1
numa: 0
onboot: 1
ostype: l26
scsihw: virtio-scsi-single
smbios1: uuid=aacad350-4b0b-40e8-9339-2e30d0d2bbdf
sockets: 1
virtio0: local-lvm:vm-107-disk-0,discard=on,format=raw,size=40G
vmgenid: 2da4787e-7308-47f8-a806-79a9a5da673e

[windows-vm]
agent: 1
boot: order=ide0;ide2;net0
cores: 2
ide0: local-lvm:vm-106-disk-0,discard=on,format=raw,size=60G
ide2: none,media=cdrom
machine: pc-i440fx-6.2
memory: 3072
meta: creation-qemu=6.2.0,ctime=1650766082
name: fs-adfs-001
net0: virtio=B6:84:BA:EC:31:D8,bridge=vmbr1,firewall=1
numa: 0
onboot: 1
ostype: win10
scsihw: virtio-scsi-pci
smbios1: uuid=9d80b912-a1b8-4dd1-8c4b-f968cc965e31
sockets: 1
vmgenid: ac431d79-8140-4265-b7dc-fc95fccd0ce9
 
