VM Performance Issue During Clone and Restore

mmfesucs

Mar 13, 2024
We are facing an issue and are not sure how to troubleshoot it or find the exact cause.

Whenever we clone a VM or restore a backup from PBS, the other VMs are affected and stop functioning properly, for example hanging for some time.
I already tried making the VM hard disks use threads, but saw no improvement.

I also asked on other platforms, and someone suggested limiting the bandwidth. I noticed a small improvement while cloning only (the hanging period decreased), but the same issue persisted while restoring.

Someone else suggested checking the "IO Delay", and I noticed it reaches 30-40% or more.
While cloning, the "IO Delay" stays high for 3-4 minutes at the beginning of the task and the hanging lasts for the same amount of time; after that, everything returns to normal.
When restoring, however, the "IO Delay" stays high for the entire task, which takes about 20-30 minutes depending on the size, and the hanging lasts for the same amount of time.

Are there any other suggestions on how to proceed from here to solve this issue?
Bear in mind that I'm a beginner with both Proxmox and Linux commands, so I would appreciate some detail.
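
For reference, the "IO Delay" figure in the Proxmox summary corresponds, as far as I know, to the share of CPU time the host spends in iowait, which Linux exposes in /proc/stat. The sketch below is one way to watch that value from the shell while a clone or restore runs. The function names `iowait_pct` and `sample_iowait` are made up for this example, and the result is only an approximation of what the web UI shows.

```shell
#!/bin/sh
# iowait_pct: given two aggregate "cpu ..." lines from /proc/stat (taken some
# seconds apart), print the percentage of CPU time spent in iowait between them.
iowait_pct() {
    printf '%s\n%s\n' "$1" "$2" | awk '
        {
            # Fields 2-9: user nice system idle iowait irq softirq steal
            total[NR] = 0
            for (i = 2; i <= 9 && i <= NF; i++) total[NR] += $i
            iowait[NR] = $6
        }
        END {
            dt = total[2] - total[1]
            if (dt <= 0) { print 0; exit }
            printf "%d\n", (iowait[2] - iowait[1]) * 100 / dt
        }'
}

# sample_iowait: read /proc/stat twice, INTERVAL seconds apart (default 5),
# and print the iowait percentage over that window.
sample_iowait() {
    interval="${1:-5}"
    first=$(head -n 1 /proc/stat)
    sleep "$interval"
    second=$(head -n 1 /proc/stat)
    iowait_pct "$first" "$second"
}
```

Running `sample_iowait 5` on the Proxmox host during a restore should print a number in the same range as the "IO Delay" graph if the assumption above holds.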
 
What disks are you using? It sounds like you are using SSDs that are not up to hypervisor workloads.
 
@B.Otto
We have Proxmox installed on an "HPE ProLiant DL380 Gen9" server, which has SSDs that are compatible with that model according to our local supplier:
"SSD - SATA DS - 960 GB - P18478"
 
You can limit the bandwidth for clone/backup/restore at the datacenter level; it should help.
There is also traffic control in the PBS web UI.
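
As a rough illustration of the datacenter-level limit: Proxmox VE reads a `bwlimit` line from /etc/pve/datacenter.cfg, and `qm clone` accepts a per-task `--bwlimit`, both in KiB/s. The helper names below (`mib_to_kib`, `bwlimit_line`, `clone_command`) are invented for this sketch, the 50 MiB/s figure is only an example value, and the functions print the settings rather than applying them.

```shell
#!/bin/sh
# Convert a rate in MiB/s to the KiB/s unit used by Proxmox VE bwlimit options.
mib_to_kib() {
    echo $(( $1 * 1024 ))
}

# Print the datacenter.cfg line that caps clone and restore tasks at the given MiB/s.
bwlimit_line() {
    kib=$(mib_to_kib "$1")
    echo "bwlimit: clone=${kib},restore=${kib}"
}

# Print (without running) a clone command limited to the given MiB/s.
# Arguments: source VM ID, new VM ID, limit in MiB/s.
clone_command() {
    echo "qm clone $1 $2 --bwlimit $(mib_to_kib "$3")"
}
```

For example, `bwlimit_line 50` prints `bwlimit: clone=51200,restore=51200`, which could be added to /etc/pve/datacenter.cfg, and `clone_command 100 9100 50` prints a one-off clone command with the same cap (the VM IDs 100 and 9100 are placeholders).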
 
I tested cloning and noticed that during the first 15% of the clone task the "IO Delay" increases, as in the screenshot:

[Screenshot: 1710417230242.png]

[Screenshot, clone duration: 1710417258909.png]

After 15-20%, the task continues normally without the "IO Delay" increasing.
 
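For backups, I suggest testing with performance max-workers=1 (look up the job <id> in jobs.cfg first):
# cat /etc/pve/jobs.cfg
# pvesh set /cluster/backup/<id> --performance max-workers=1

For clones, just set write-back on the underlying storage temporarily.

Hardware RAID:
# ./storcli64 /c0/v1 set wrcache=awb
# qm clone ...
# ./storcli64 /c0/v1 set wrcache=wb

ZFS (replace <pool>/<dataset> with the dataset that holds the VM disks):
# zfs set sync=disabled <pool>/<dataset>
# qm clone ...
# zfs set sync=standard <pool>/<dataset>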
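
The ZFS variant above can be sketched as a small wrapper that remembers the current `sync` value, disables it for one command, and puts it back afterwards. This is only an illustration: `run` and `with_sync_disabled` are made-up helper names, `rpool/data` in the usage note is a placeholder dataset, and the `DRY_RUN=1` mode exists just so the sequence can be previewed without touching the pool. While `sync` is disabled, a power loss can cost recent writes, which is why the wrapper restores the setting even if the command fails.

```shell
#!/bin/sh
# run: execute a command, or just print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# with_sync_disabled DATASET COMMAND...
# Temporarily sets sync=disabled on DATASET, runs COMMAND, then restores
# the previous sync value and returns COMMAND's exit status.
with_sync_disabled() {
    dataset="$1"
    shift
    if [ "${DRY_RUN:-0}" = "1" ]; then
        previous="standard"   # assumed value in dry-run mode
    else
        previous=$(zfs get -H -o value sync "$dataset")
    fi
    run zfs set sync=disabled "$dataset"
    run "$@"
    status=$?
    run zfs set "sync=$previous" "$dataset"
    return "$status"
}
```

With `DRY_RUN=1` set, `with_sync_disabled rpool/data qm clone 100 9100` only prints the three commands in order; without it, the commands are actually executed on the host.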
 
@rj45
I have not yet tried what you mentioned above regarding performance.
As I said before, I'm a beginner in this area and am still trying to understand the commands you provided.
If you can provide more insight on the subject, it would be much appreciated.

One thing I noticed while testing a restore: the speed does not match the limit I set, and the "IO Delay" is high from the start to the end of the job.
[Screenshots: 1710578744180.png, 1710578752401.png]
 