random VM ends up in paused state after "Bulk migrate"

That looks like a different issue (likely an overload of the target node). Do you have monitoring in place, and if so, what does it say?

The nodes I'm bulk-migrating to are empty (no VMs or containers running on the target node). While bulk-migrating, the CPU cores are not really in use and memory is only at 30 GB of 126 GB used, but the load goes up to 5.56. Could that be the problem? Shouldn't migrations still complete without errors even when the load goes that high?

Edit: it seems like you're right. The load barely goes over 1.0 with parallel jobs = 1, and no VM ends up needing to be resumed. Is there any way to keep users from running a bulk migrate with many parallel jobs, e.g. via a hint in the UI?
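For anyone who wants to script the "parallel jobs = 1" behaviour instead of relying on the UI, here is a minimal sketch. It assumes it runs as root on the source node, that the standard `qm` CLI is available, and that the cluster's usual root SSH between nodes works; the target node name and VM id list are placeholders, not taken from this thread.

```python
#!/usr/bin/env python3
"""Minimal sketch: migrate VMs one at a time ("parallel jobs = 1") and
resume any VM that is left paused on the target node afterwards."""
import subprocess

TARGET_NODE = "pve-target"   # assumption: name of the empty target node
VMIDS = [101, 102, 103]      # assumption: the VMs selected for bulk migration

for vmid in VMIDS:
    # Live-migrate one VM at a time, so the target never has to resume
    # several freshly migrated VMs at once.
    subprocess.run(["qm", "migrate", str(vmid), TARGET_NODE, "--online"],
                   check=True)

    # The VM now lives on the target node, so query its state there.
    # `qm status --verbose` includes the QMP state ("qmpstatus: paused"
    # when the guest was not resumed after migration).
    status = subprocess.run(
        ["ssh", TARGET_NODE, "qm", "status", str(vmid), "--verbose"],
        check=True, capture_output=True, text=True).stdout

    if "qmpstatus: paused" in status:
        # Resume the guest on the target node instead of leaving it paused.
        subprocess.run(["ssh", TARGET_NODE, "qm", "resume", str(vmid)],
                       check=True)
```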

No. If a user can migrate, they can also start several manual migrations in parallel, so there is no sense in limiting the bulk migrate feature. I'd look at storage tuning next: possibly the VMs booting causes too much I/O load? What kind of storage are you using? Can you try whether enabling I/O threads helps (this requires using SCSI disks with virtio-scsi-single as the controller!)?
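For reference, a hedged sketch of applying those two settings from the CLI. The VM id and volume name below are placeholders (check `qm config <vmid>` for the real values), and it assumes the VM already has a scsi0 disk.

```python
#!/usr/bin/env python3
"""Sketch: switch a VM's SCSI controller to virtio-scsi-single and enable
an I/O thread on its first SCSI disk, as suggested above."""
import subprocess

VMID = "101"                          # assumption: the VM to tune
DISK = "local-lvm:vm-101-disk-0"      # assumption: the existing scsi0 volume

# Use the single virtio-scsi controller so each disk can get its own I/O thread.
subprocess.run(["qm", "set", VMID, "--scsihw", "virtio-scsi-single"], check=True)

# Re-declare scsi0 with iothread=1; this keeps the same volume and only adds the option.
subprocess.run(["qm", "set", VMID, "--scsi0", f"{DISK},iothread=1"], check=True)
```

The same change can of course be made in the GUI under the VM's Hardware tab; the point is simply that the disk must be on a SCSI bus with the virtio-scsi-single controller for the iothread option to take effect.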