VMs hang after live migration since upgrade to PVE 7.2 with "BUG: soft lockup"

abma

Active Member
Feb 20, 2021
Since the upgrade to PVE 7.2, Linux VMs hang after live migration (see the screenshot). The VM still responds to ping, but SSH login no longer works.

An example VM config:

Code:
agent: 1,fstrim_cloned_disks=1
balloon: 16000
boot: order=scsi0
cores: 14
machine: q35
memory: 12000
name: serverhang
net0: virtio=E2:49:F1:7G:DB:65,bridge=vmbr0,firewall=1
numa: 0
onboot: 1
ostype: l26
scsi0: cephrbd:vm-100-disk-0,discard=on,size=80G
scsi1: cephrbd:vm-100-disk-1,discard=on,size=20G
scsihw: virtio-scsi-pci
smbios1: uuid=...
sockets: 1
startup: up=90
vmgenid: ...

The OS of the VM is Debian 11 stable. Any idea what could cause this? Live migration worked fine with PVE 7.1.

One thing I've noticed: the clock of the VM must have been VERY wrong at some point, judging by this file's modification time:

Code:
ls -lah /var/log/btmp
-rw-rw---- 1 root utmp 0 16. Mär 2256  /var/log/btmp

That's 16.03.2256, i.e. 16 March 2256 ("Mär" is the German locale's abbreviation for March).
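
For anyone who wants to check whether other files in the guest picked up a bogus timestamp as well, here is a minimal sketch. It is not a Proxmox tool; the function name `find_future_mtimes` and the 60-second tolerance are made up for illustration. It walks a directory and lists files whose modification time lies in the future relative to the current clock.

```python
import os
import time


def find_future_mtimes(root, now=None, tolerance_s=60.0):
    """Return sorted (path, mtime) pairs under root whose mtime is in the future.

    Hypothetical helper: `tolerance_s` is an arbitrary allowance for small
    clock differences, not a value taken from any Proxmox documentation.
    """
    if now is None:
        now = time.time()
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                # lstat so that symlinks are checked themselves, not their targets
                mtime = os.lstat(path).st_mtime
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if mtime > now + tolerance_s:
                hits.append((path, mtime))
    return sorted(hits)
```

Run inside the guest as root, something like `find_future_mtimes("/var/log")` should list files such as the `/var/log/btmp` above if their timestamps are still ahead of the (now corrected) clock. It only shows that the clock jumped at some point, not why.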


The migration was between an AMD EPYC 7502 host and an AMD EPYC 7662 host. The installed software should be the same on both: "latest stable".
 

We have a similar problem with 7.2: triggering a CT migration causes the physical PVE host to become unresponsive, and we have to power-reset it to make it work again. :(
 
I don't understand what you mean. Our physical hosts run flawlessly; only the VM hangs for us. That sounds like a different problem.

If so, please create a new thread.