IO Delay problems

Jon W

New Member
Aug 28, 2024
I am using a single Proxmox host in a home setting to run a few VMs and assumed a not-very-powerful system would be enough. But I am experiencing issues with just a single guest VM running headless Debian 12, and would appreciate any feedback.

Proxmox is installed on a Dell Precision 3440 with a Xeon W-1250 and 64GB ECC RAM.
Standard install on a ZFS root using two drives in a mirror.
Both drives are Crucial P3 500GB SSDs in M.2 slots on the motherboard.
Guest VMs also use the mirrored pair for storage.

With only a single headless Debian 12 guest VM running (VirtIO raw disk image), running apt upgrade or writing anything to disk creates significant IO delay on the host.
Downloading a 10GB file as a test almost immediately spikes IO delay to 10-20%, and within a minute it averages 50-80%.
Downloading the same file on the Proxmox host itself causes IO delay to hover around 10%, which also seems unexpectedly high.

I understand these are not the greatest drives and that could be the cause, but I was not expecting something as simple as downloading a single file to cause this degree of delay. Given the setup described, is this normal?
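For reference, the "IO delay" number in the Proxmox summary graph is the CPU iowait percentage reported by the kernel, so it can also be sampled by hand while a test runs. A minimal sketch, assuming a Linux host with the standard /proc/stat layout:

```shell
# Sketch: compute this host's IO-delay percentage over a 1-second window,
# the same iowait metric the Proxmox summary graph reports.
# Fields are per the standard /proc/stat "cpu" line (iowait is the 5th value).
s1=$(head -n1 /proc/stat)
sleep 1
s2=$(head -n1 /proc/stat)
pct=$(printf '%s\n%s\n' "$s1" "$s2" | awk '
  NR==1 { io1 = $6; for (i = 2; i <= NF; i++) t1 += $i }
  NR==2 { io2 = $6; for (i = 2; i <= NF; i++) t2 += $i }
  END   { if (t2 > t1) printf "iowait: %.1f%%", 100 * (io2 - io1) / (t2 - t1)
          else printf "iowait: 0.0%%" }')
echo "$pct"
```

Looping this in a second shell during the download test gives the same number the summary graph shows, one sample per second.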

--------------------

Attempting to figure this out, I tested another not-so-great drive I had on hand: a Phison S10C 512GB SATA SSD. This drive should be even less performant, but it may have a DRAM cache (unknown), which the Crucial does not.

I connected this SATA SSD to the motherboard, created a single ext4 partition, and added it to Proxmox. If I clone the Debian 12 VM onto this new storage and download the 10GB file, IO delay is imperceptible. I am confused that a lesser drive does not have this problem at all.

Thinking it could be ZFS, I wiped this drive, created a ZFS pool (all default settings) with just this one drive, created a single dataset for storage, and added it to Proxmox.
Cloning the Debian VM again onto this new ZFS storage and downloading the 10GB file, IO delay intermittently spikes to somewhere between 0-5% for a second but drops right back down. This seems negligible and acceptable.
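For anyone repeating this, the steps above amount to roughly the following. The device path /dev/sdb and the storage ID "sata-zfs" are assumptions for illustration, not my exact values, and zpool create destroys whatever is on the disk:

```shell
# Hypothetical equivalents of the steps described above.
# /dev/sdb and the storage ID "sata-zfs" are assumptions.
# WARNING: zpool create wipes the target disk.
zpool create testpool /dev/sdb                        # single-disk pool, all defaults
zfs create testpool/vmstore                           # one dataset for VM storage
pvesm add zfspool sata-zfs --pool testpool/vmstore    # register it as Proxmox storage
```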

--------------------

It is hard to understand how an even worse drive does not cause the IO delay. If I replace the Crucial NVMe drives with something better, should I expect this problem to go away?
It seems like I should get additional drive(s) to store the guest VMs anyway. Is it more likely that the Crucial NVMe drives themselves are causing the IO delay, or that it is because I am using the same drives for both booting and VM storage?
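Whichever way this goes, it may be worth benchmarking the current drives first so any replacement can be compared against a number. A hypothetical fio job file (run as `fio seqwrite.fio`; the filename path is an assumption, point it at the storage under test) that roughly mimics the sustained writes of the 10GB download:

```ini
# seqwrite.fio -- hypothetical job roughly mimicking the 10GB download test.
# filename is an assumption; point it at a file on the pool under test.
[seqwrite]
filename=/rpool/data/fio-test.dat
size=10g
rw=write
bs=1M
end_fsync=1
```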

Any and all input or advice is much appreciated.

--------------------
Thanks. I had a strong sense that might be the case. It is hard to believe just how poorly these drives perform under (any) load. I understand I should separate VM storage from the boot drive, and I know ZFS has overhead, but I guess it is mostly the drives themselves. I was just not expecting a single guest downloading a single file to be capable of rendering the host unresponsive. I guess it is time to look for better drives for VM storage, and maybe the boot drives as well.