Poor VM performance

visor7230

Nov 17, 2024
Hi All,
I have been experiencing significant performance issues on my Proxmox server, specifically with new VMs. Despite low resource utilization on the host, operations like package updates (apt, dnf, yum) are painfully slow, often running at around 20 KB/s. I have difficulty testing beyond this since installing even a simple package takes forever.

I have attached my fio benchmarks; maybe I am missing something?

System Details:

  • Dell R630
  • CPU: 2x Intel Xeon E5-2623 v3
  • Memory: 128 GB DDR4
  • Storage: RAID10 ZFS pool with SSDs
  • Proxmox Version: 8.3.3 - Tried with kernel 6.8 and the new 6.11
  • Network: Intel(R) 2P X520/2P I350 rNDC - 2x 10G ports set up in an active-backup bond

Symptoms:

  1. New VMs:
    • Updates via apt, dnf, and yum are extremely slow (~20 KB/s).
    • Example from a Debian VM:
      Code:
      [1116.820709] cloud-init[580]: Fetched 26.1 MB in 18min 27s (23.6 kB/s)
    • Slow performance persists with both cloud-init-based setups and direct ISO installs.
    • Difficult to benchmark since installing these machines takes so long (see the quick check after this list).
  2. Existing VMs:
    • Benchmarks inside Linux VMs are similar to the host's performance.
    • Windows VMs are very slow, almost unusable.
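A quick way to separate download speed from disk speed inside one of those slow guests (the URL and file path are just examples, pick any file on your usual mirror) is to fetch to /dev/null and then do a direct write with dd:
Code:
# network only: download is thrown away, nothing touches the disk
curl -o /dev/null -w 'download speed: %{speed_download} bytes/s\n' http://deb.debian.org/debian/dists/stable/Release
# disk only: write 256 MB with O_DIRECT, bypassing the page cache
dd if=/dev/zero of=/var/tmp/ddtest bs=1M count=256 oflag=direct status=progress
rm /var/tmp/ddtest
If the download is slow but the dd is fast (or vice versa), that at least narrows down which path is the bottleneck.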
What I have checked:
  1. Host Resource Usage:
    • IO wait is low.
    • CPU and RAM usage are well within acceptable ranges.
      • Low to idle CPU usage
      • Less than half of my memory is in use
  2. Network:
    • Speedtests from the host and VMs show expected results.
    • iperf3 confirms network performance is within normal ranges.
  3. Storage:
    • FIO results attached (see the sync-write fio sketch after this list). From what I can tell, my array is very performant.
  4. Logs: Checked journalctl, dmesg, and VM logs—nothing out of the ordinary.
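For comparing host and guest, a small sync-write fio run (file path is an example; run it once on the host pool and once inside a VM) often exposes the gap better than large sequential tests, because package managers and installers issue many small fsynced writes:
Code:
fio --name=syncwrite --filename=/var/tmp/fio.test --size=512M \
    --rw=randwrite --bs=4k --iodepth=1 --numjobs=1 \
    --fsync=1 --ioengine=psync --runtime=60 --time_based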
What else should I be checking, and am I missing something?
 

Attachments: fio benchmark results

For some additional details, here is my vm config:
Code:
agent: 1
bios: ovmf
boot: order=virtio0;scsi1
cicustom: vendor=local:snippets/ubuntu.yaml
cipassword: REMOVED
ciuser: REMOVED
cores: 2
cpu: host
efidisk0: storage:vm-200-disk-0,pre-enrolled-keys=0,size=1M
ipconfig0: ip=dhcp
machine: q35
memory: 2048
meta: creation-qemu=9.0.2,ctime=1731108814
name: ubuntu-001
net0: virtio=BC:24:11:E6:BE:A0,bridge=Servers
numa: 0
ostype: l26
scsi1: storage:vm-200-cloudinit,media=cdrom,size=4M
scsihw: virtio-scsi-pci
serial0: socket
smbios1: uuid=de8b5549-aa38-4d89-9f5c-cb6c319c94be
sockets: 1
sshkeys: REMOVED
vga: serial0
virtio0: storage:vm-200-disk-1,discard=on,size=48G
vmgenid: 9c41f0dc-44e2-4e58-9ec8-86e4024ffbf7
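Not necessarily the cause, but the virtio0 disk in that config uses the default cache/iothread settings. As an experiment (VM ID and volume name taken from the config above, the options are only a suggestion), the disk options can be changed in place without touching the data:
Code:
qm set 200 --virtio0 storage:vm-200-disk-1,discard=on,iothread=1
# shut down and start the VM afterwards so the new options take effect
qm shutdown 200 && qm start 200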
 
SSD model?
Storage controller?
SSDs are 1TB Samsung 850 Pros (Yes, I know - consumer SSD:))

Controller is a PERC H330 Mini in HBA mode


I've tried other disk arrays, as well as a single SSD with a single VM on it, and they all show the same symptoms with new VMs.

Other VMs on the same array perform as expected, and are very fast.
 
How long have you had the 850 Pros installed and in use?
The 850 Pro doesn't appear to support "Deterministic Read Zeroes after TRIM", so TRIM is not supported through the PERC H330.
I wouldn't use these disks without TRIM; they slow down badly after enough data has been written, which looks like your case.

Try a single SSD on the embedded AHCI SATA controller; after a format, the speed can be restored.
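Whether the drives actually report DRAT/TRIM support through the controller can be checked from the host (replace /dev/sdX with one of the pool members), for example:
Code:
# what the drive itself advertises
hdparm -I /dev/sdX | grep -i trim
# what the kernel sees through the controller (DISC-GRAN/DISC-MAX of 0 means no discard support)
lsblk --discard /dev/sdX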
 
How long have you had the 850 Pros installed and in use?
The 850 Pro doesn't appear to support "Deterministic Read Zeroes after TRIM", so TRIM is not supported through the PERC H330.
I wouldn't use these disks without TRIM; they slow down badly after enough data has been written, which looks like your case.

Try a single SSD on the embedded AHCI SATA controller; after a format, the speed can be restored.
I formatted and installed them a few weeks ago, but the drives themselves are older.

The issue persists even if I disable that ZFS datastore; I/O waits are still very low, but the VMs still have this performance problem.
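One way to check whether ZFS can actually issue TRIM to the pool members at all (the pool name "tank" is just an example):
Code:
zpool get autotrim tank
zpool trim tank        # manual trim; this fails if discard doesn't pass through the controller
zpool status -t tank   # shows per-device trim status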
 
The issue persists even if I disable that ZFS datastore
It has nothing to do with the filesystem.
Your SSDs can't be trimmed through the PERC, so they will get slower over time and wear out faster.

Have you tried them directly on the embedded SATA controller? That's not possible on the fly from the bays.

Another test would be with a proper SSD, e.g. a used datacenter drive.
 
It has nothing to do with the filesystem.
Your SSDs can't be trimmed through the PERC, so they will get slower over time and wear out faster.

Have you tried them directly on the embedded SATA controller? That's not possible on the fly from the bays.

Another test would be with a proper SSD, e.g. a used datacenter drive.
I don't have any ports connected to the embedded SATA controller, unfortunately.
I can try to see if I can grab an SSD from work.

Why would benchmarks show excellent random write performance on the Proxmox host, while the VMs are abysmally slow?
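One more data point that is easy to collect on the host: pveperf reports FSYNCS/SECOND for a given path, which tracks the small synchronous writes that installers and package managers produce much better than large sequential benchmarks do (the path is an example; point it at the directory backing the VM storage):
Code:
pveperf /path/to/your/vm/storage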
 
I was able to move a VM onto a standalone SSD on the SATA controller, but it still has the same speed issue.

I shut down the rest of my VMs so there was no possibility of resource contention, but I still see the same issue :/
 
Can you try pinning all of a VM's cores to the same socket on your host?
Code:
numactl --hardware

So that all cores of VM1 run on socket 1, VM2 on socket 2, and so on?
Don't forget to shut down/start the VM or LXC container after you change the config.

Just for testing, to rule that out.
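A rough sketch of how that pinning could look on a reasonably recent Proxmox version (VM ID and core range are examples; pick cores that numactl --hardware lists for a single NUMA node):
Code:
# assuming cores 0-7 belong to NUMA node 0 on this host
qm set 200 --affinity 0-7
qm shutdown 200 && qm start 200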