IODelay on restore / migrate

DJP

Active Member
Jan 23, 2011
Hello everyone,

I'm running Proxmox with lvm-thin on local hard drives, and I can't find any way to fix the huge IO delay on the server when receiving, restoring, or migrating a VM.

I see this on Proxmox 5 and 6, across different server generations: Dell R620 or R640 with a PERC controller and either mechanical HDDs or SSDs (enterprise grade, not consumer), in RAID1 or RAID5. It's the same everywhere.

pveperf only checks the root disk (where Proxmox is installed). Do you know a way to benchmark read/write performance on the local-lvm disk? And what should I expect as a "minimum" value?

I've modified datacenter.cfg on Proxmox 6 to limit the bandwidth to 100 MB/s, which doesn't seem huge to me for SSDs, and I still see an IO delay between 5 and 15. As a result, every VM on the Proxmox host hangs or slows down massively when it tries to write to disk.

On Proxmox 6, I noticed at first glance that it happens while the disk image is being zeroed. While the image file is being copied, there is no IO delay. I ran a test on Proxmox 5 and it seems to be the same (I have fewer logs to confirm it, but it looks that way).

Can't we limit the bandwidth of the lvm-thin zeroing, since it seems to be the culprit, and not only the disk image copy (which is limited by cstream, as far as I've seen)?

Thanks for your help. I can't find anything on the forum that works, and I guess I'm not alone (!?) :-(

Regards,
PS: Thanks to the Proxmox team, you guys are awesome!
 
You can use fio to benchmark your disks. Is there a reason why you still use PVE 5? Could you please post your datacenter.cfg? Testing different speed limits might be "easier" by using the CLI for migrations because there you can usually set the respective parameters.
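As a rough sketch of the fio suggestion: fio accepts a block device path in `--filename`, so a logical volume can be benchmarked directly instead of a file. The names below (VG `pve`, thin pool `data`, scratch LV `fiotest`) are assumptions, not taken from this thread, and write tests destroy whatever is on the target, so the helper only prints the command for review rather than running it.

```shell
#!/bin/sh
# Sketch: build a fio command line that benchmarks a block device directly.
# WARNING: write tests destroy data on the target; use a scratch LV only.
# Assumed names (not from the thread): VG "pve", thin pool "data", LV "fiotest".
#   lvcreate -V 10G -T pve/data -n fiotest   # create a scratch thin volume
#   lvremove pve/fiotest                     # remove it when finished

fio_cmd() {
    # $1 = block device, $2 = fio rw mode (read, write, randread, randwrite)
    printf 'fio --name=lvtest --filename=%s --rw=%s --bs=4k --direct=1 --ioengine=libaio --iodepth=32 --runtime=60 --time_based --group_reporting\n' "$1" "$2"
}

# Print the command for review; run it by hand once the target is confirmed.
fio_cmd /dev/pve/fiotest randwrite
```

`--direct=1` bypasses the page cache so the numbers reflect the RAID volume rather than RAM. On a freshly created thin volume, blocks that were never written are not backed by storage, so it is probably worth filling the volume once before trusting any read results.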
 
Hi Dominic,

Thanks for your answer.

Do you know how to use fio to test an LV or a VG instead of a file, or should I run fio inside a VM?

I have a legacy PVE 5 setup and wanted to compare it with PVE 6. The issue occurs on both PVE 6 and PVE 5, with different hardware.

Here is my datacenter.cfg at the moment:
Code:
bwlimit: clone=102400,default=102400,migration=102400,move=102400,restore=102400
keyboard: en-us
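
For reference, the `bwlimit` values in datacenter.cfg are expressed in KiB/s, so 102400 corresponds to 100 MiB/s. A small sketch of the conversion, where `bwlimit_mib` is a hypothetical helper name used only for illustration:

```shell
#!/bin/sh
# Convert each bwlimit entry from KiB/s (the unit used in datacenter.cfg)
# to MiB/s. bwlimit_mib is a hypothetical helper, not a Proxmox tool.
bwlimit_mib() {
    # $1 = value part of the bwlimit line, e.g. "clone=102400,move=51200"
    echo "$1" | tr ',' '\n' | while IFS='=' read -r key kib; do
        echo "$key: $((kib / 1024)) MiB/s"
    done
}

bwlimit_mib "clone=102400,default=102400,migration=102400,move=102400,restore=102400"
```

With the line above, every entry comes out as 100 MiB/s. The `--bwlimit` option of `qm migrate` and `qmrestore` takes the same KiB/s unit, which makes it easy to try lower limits per operation without editing the file.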

That being said, I think the issue is the zeroing and not the copying (which cstream throttles based on datacenter.cfg)!

DJP