I have a system running PVE 7.3 that is suffering from poor VM performance. Specifically, I have attempted to set up two different Linux VMs (one Ubuntu 22.04 server, one Home Assistant OS), and performance has been so poor that I didn't even make it through the installer. As an example, just booting into the Ubuntu installer took at least 10 minutes, and the HASS VM has been attempting to complete its boot sequence for at least 30 minutes. I have about 10 containers running on the machine without any performance issues.
I previously had this machine running PVE 7.2, at which point I had VMs set up and running with no observable performance issues. The only differences between the previous, working setup and the current one are: 1) PVE 7.2 => 7.3 (clean install) and 2) a Quadro P400 has been added. I have not tried removing the GPU to see if that makes a difference, because it was a bit of a headache getting it installed (or at least dealing with the ramifications of getting it installed).
I have reviewed a number of posts reporting laggy VM performance, but most of them seem to be about desktop GUI performance; I haven't found one that seems related to the particular issue I'm facing. That said, one error I saw while attempting to install Ubuntu was "watchdog: BUG: soft lockup - CPU#n stuck for Xs" (I haven't seen anything along these lines with the HASSOS VM). I found this post with a similar issue and have tried the suggested workarounds (virtio-scsi-single, IO thread, Async IO: threads), but they didn't make a noticeable difference.
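For anyone wanting to confirm that the three workarounds really are present in a VM's config, the `qm config` output can be scanned for them. This is a minimal sketch, not a Proxmox tool: `check_workarounds` is a hypothetical helper name, and it only looks for the literal strings `scsihw: virtio-scsi-single`, `iothread=1`, and `aio=threads`. The sample input is copied from the VM 107 config further down.

```shell
#!/bin/sh
# check_workarounds (hypothetical helper): read `qm config <vmid>` output on
# stdin and report whether the three workarounds appear in it.
check_workarounds() {
    awk '
        # Controller type is a VM-wide setting.
        /^scsihw: virtio-scsi-single$/ { hw = 1 }
        # iothread and aio are per-disk options on scsiN/virtioN lines.
        /^(scsi|virtio)[0-9]+: / {
            if (index($0, ",iothread=1"))  io = 1
            if (index($0, ",aio=threads")) aio = 1
        }
        END {
            printf "virtio-scsi-single=%s iothread=%s aio-threads=%s\n",
                   (hw ? "yes" : "no"), (io ? "yes" : "no"), (aio ? "yes" : "no")
        }'
}

# Demo with two lines taken from the VM 107 config in this post.
check_workarounds <<'EOF'
scsi0: fast-os:vm-107-disk-0,aio=threads,discard=on,iothread=1,size=64G
scsihw: virtio-scsi-single
EOF
```

On the host itself this would be run as `qm config 107 | check_workarounds`.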
I'm a bit mystified as to where to look next. Thanks for any thoughts.
Hardware:
- Lenovo TD340
- 2 x E5-2450 v2 (8 core, 16 thread)
- 96 GB DDR3
- OS disks are 500GB HDD in ZFS mirror
- VM storage disks are 6 x Intel 480GB SSD in RAID Z2
- Network cards: Intel 4 x 1GBE, Intel 2 x 10GBE
Bash:
root@pve:~# pveversion -v
proxmox-ve: 7.3-1 (running kernel: 5.15.83-1-pve)
pve-manager: 7.3-4 (running version: 7.3-4/d69b70d4)
pve-kernel-5.15: 7.3-1
pve-kernel-helper: 7.3-1
pve-kernel-5.15.83-1-pve: 5.15.83-1
pve-kernel-5.15.74-1-pve: 5.15.74-1
pve-kernel-5.15.30-2-pve: 5.15.30-3
ceph-fuse: 15.2.16-pve1
corosync: 3.1.7-pve1
criu: 3.15-1+pve-1
glusterfs-client: 9.2-1
ifupdown2: 3.1.0-1+pmx3
ksm-control-daemon: 1.4-1
libjs-extjs: 7.0.0-1
libknet1: 1.24-pve2
libproxmox-acme-perl: 1.4.3
libproxmox-backup-qemu0: 1.3.1-1
libpve-access-control: 7.3-1
libpve-apiclient-perl: 3.2-1
libpve-common-perl: 7.3-1
libpve-guest-common-perl: 4.2-3
libpve-http-server-perl: 4.1-5
libpve-storage-perl: 7.3-1
libspice-server1: 0.14.3-2.1
lvm2: 2.03.11-2.1
lxc-pve: 5.0.0-3
lxcfs: 4.0.12-pve1
novnc-pve: 1.3.0-3
proxmox-backup-client: 2.3.2-1
proxmox-backup-file-restore: 2.3.2-1
proxmox-mini-journalreader: 1.3-1
proxmox-offline-mirror-helper: 0.5.0-1
proxmox-widget-toolkit: 3.5.3
pve-cluster: 7.3-1
pve-container: 4.4-2
pve-docs: 7.3-1
pve-edk2-firmware: 3.20220526-1
pve-firewall: 4.2-7
pve-firmware: 3.6-2
pve-ha-manager: 3.5.1
pve-i18n: 2.8-1
pve-qemu-kvm: 7.1.0-4
pve-xtermjs: 4.16.0-1
qemu-server: 7.3-2
smartmontools: 7.2-pve3
spiceterm: 3.2-2
swtpm: 0.8.0~bpo11+2
vncterm: 1.7-1
zfsutils-linux: 2.1.7-pve2
Bash:
root@pve:~# lscpu
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
Address sizes: 46 bits physical, 48 bits virtual
CPU(s): 16
On-line CPU(s) list: 0-15
Thread(s) per core: 1
Core(s) per socket: 8
Socket(s): 2
NUMA node(s): 1
Vendor ID: GenuineIntel
CPU family: 6
Model: 62
Model name: Intel(R) Xeon(R) CPU E5-2450 v2 @ 2.50GHz
Stepping: 4
CPU MHz: 2500.000
CPU max MHz: 2500.0000
CPU min MHz: 1200.0000
BogoMIPS: 4987.75
Virtualization: VT-x
L1d cache: 512 KiB
L1i cache: 512 KiB
L2 cache: 4 MiB
L3 cache: 40 MiB
NUMA node0 CPU(s): 0-15
Vulnerability Itlb multihit: KVM: Mitigation: Split huge pages
Vulnerability L1tf: Mitigation; PTE Inversion; VMX conditional cache flushes, SMT disabled
Vulnerability Mds: Mitigation; Clear CPU buffers; SMT disabled
Vulnerability Meltdown: Mitigation; PTI
Vulnerability Mmio stale data: Unknown: No mitigations
Vulnerability Retbleed: Not affected
Vulnerability Spec store bypass: Mitigation; Speculative Store Bypass disabled via prctl and seccomp
Vulnerability Spectre v1: Mitigation; usercopy/swapgs barriers and __user pointer sanitization
Vulnerability Spectre v2: Mitigation; Retpolines, IBPB conditional, IBRS_FW, RSB filling, PBRSB-eIBRS Not affected
Vulnerability Srbds: Not affected
Vulnerability Tsx async abort: Not affected
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm cpuid_fault epb pti ssbd ibrs ibpb stibp tpr_shadow vnmi flexpriority ept vpid fsgsbase smep erms xsaveopt dtherm arat pln pts md_clear flush_l1d
Interestingly, hyperthreading is enabled in the BIOS, but the OS doesn't recognize it: lscpu reports 1 thread per core, and the L1tf and Mds vulnerability lines show "SMT disabled".
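The SMT state can also be cross-checked from the host beyond the lscpu output. This is a rough sketch under a few assumptions: the helper names are made up for illustration, the kernel is assumed to expose `/sys/devices/system/cpu/smt/control` (the script prints "unavailable" if it doesn't), and only the plain `nosmt` and `nosmt=force` kernel parameters are checked, not other ways SMT might be turned off.

```shell
#!/bin/sh
# Hypothetical helpers for checking why the OS reports 1 thread per core.

# Extract the "Thread(s) per core" value from lscpu-style text on stdin.
threads_per_core() {
    awk -F: '/^Thread\(s\) per core/ { gsub(/[ \t]/, "", $2); print $2 }'
}

# Print the kernel's SMT control state, or "unavailable" if the node is missing.
smt_control() {
    f=/sys/devices/system/cpu/smt/control
    if [ -r "$f" ]; then cat "$f"; else echo "unavailable"; fi
}

# Print "yes" if the kernel command line in $1 contains nosmt or nosmt=force.
has_nosmt() {
    case " $1 " in
        *" nosmt "*|*" nosmt=force "*) echo yes ;;
        *) echo no ;;
    esac
}

# Run the checks on the current machine where the inputs exist.
if command -v lscpu >/dev/null 2>&1; then
    echo "threads per core: $(lscpu | threads_per_core)"
fi
echo "smt control: $(smt_control)"
if [ -r /proc/cmdline ]; then
    echo "nosmt on cmdline: $(has_nosmt "$(cat /proc/cmdline)")"
fi
```

If the control file reads `on` while lscpu still shows 1 thread per core, that would point back at the firmware side rather than a kernel setting.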
First VM (Ubuntu 22.04):
Bash:
root@pve:~# qm config 107
balloon: 2048
boot: order=scsi0;ide2;net0
cores: 2
cpu: host
ide2: local:iso/ubuntu-22.04.1-live-server-amd64.iso,media=cdrom,size=1440306K
machine: q35
memory: 4096
meta: creation-qemu=7.1.0,ctime=1675180854
name: vm-docker
net0: virtio=CA:A0:37:52:5C:36,bridge=vmbr10,firewall=1
numa: 0
ostype: l26
scsi0: fast-os:vm-107-disk-0,aio=threads,discard=on,iothread=1,size=64G
scsihw: virtio-scsi-single
smbios1: uuid=2c359d8b-1f98-457f-9b96-8ead931f7c50
sockets: 2
tablet: 0
vmgenid: 521d9798-fe53-4111-a2fa-d7ddc52a219a
Second VM (HomeAssistant):
Bash:
root@pve:~# qm config 116
agent: enabled=1
bios: ovmf
boot: order=virtio0
cores: 4
efidisk0: fast-os:vm-116-disk-0,size=1M
machine: q35
memory: 4096
meta: creation-qemu=7.1.0,ctime=1675520478
name: vm-home-assistant
net0: virtio=F2:8C:EA:43:64:FC,bridge=vmbr10,firewall=1
net1: virtio=5A:25:76:42:E1:15,bridge=vmbr30,firewall=1
ostype: l26
scsihw: virtio-scsi-pci
smbios1: uuid=e51488f8-a492-4fb7-824b-b371a1c69732
tablet: 0
virtio0: fast-os:vm-116-disk-1,aio=threads,cache=writeback,iothread=1,size=32G
vmgenid: 7802d576-f0a8-4408-875f-709232306b5e