Live migration: all VMs on target server extremely slow

badsmoke

Member
Nov 12, 2020
Hello

We have a 3-node cluster (EPYC), each node with a 10 Gbit and a 1 Gbit network. All servers have at least one SSD installed (Samsung 860 Pro).

As soon as I start a live migration to another node, the VMs on the target node become very slow or not accessible at all (100% CPU usage).

I'm testing a bit with migration speed limits: with 25 MiB/s it seems to work without problems, but 40 MiB/s already seems to be too much.
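(For context, I set the limit roughly like this; the global value lives in /etc/pve/datacenter.cfg and can also be passed per migration with qm, both in KiB/s, so 25600 ≈ 25 MiB/s. The VMID 100 and node name pve2 below are just placeholders.)

# /etc/pve/datacenter.cfg
bwlimit: migration=25600

# or per migration
qm migrate 100 pve2 --online --bwlimit 25600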

What could be the reason for this?


Thanks for the help.

Edit: even with 25 MiB/s it doesn't seem to work out.
 
You didn't really provide many details, but it sounds like the migration overloads the disks of the target node. What kind of storage are you using? Are you migrating with local disks?
 
Thanks for the reply.

As storage I use 4 TB Samsung 860 Pro SSDs, formatted with ZFS.
According to the metrics they should not be at their limit: during the migration, the target SSD sees about 1.6k IO/s and 30 MB/s of writes (speed limit 25 MiB/s).

What do you mean by local disks?
Most VMs have 32 GB of storage.

What other information could be helpful?

Proxmox 6.4
AMD EPYC 7301
190 GB RAM
4 TB Samsung 860 Pro SSD
Ethernet: 10G X550-T
Ethernet: I210 Gigabit Network
 
You can monitor with zpool iostat (it also has flags for queues, latency, ...) and regular system monitoring tools. "Local disks" means non-shared storage.
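For example, something along these lines (flags as in the ZFS version shipped with PVE 6.x, interval in seconds):

zpool iostat -vly 5    # per-vdev stats with average latency columns, fresh 5s samples
zpool iostat -q 5      # active queue depths
zpool iostat -w        # latency histograms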
 
It does not look overly loaded to me, although the ~100 MiB/s of writes at a speed limit of 11 MiB/s is not quite logical.

Live migration with 11 MiB/s speed limit:

zpool iostat -v (target system)
              capacity     operations     bandwidth
pool        alloc   free   read  write   read  write
----------  -----  -----  -----  -----  -----  -----
ZFSssd       351G  3.38T      0   1011      0   112M
  sda        351G  3.38T      0   1011      0   112M
----------  -----  -----  -----  -----  -----  -----

              capacity     operations     bandwidth
pool        alloc   free   read  write   read  write
----------  -----  -----  -----  -----  -----  -----
ZFSssd       351G  3.38T      0  1.06K      0   115M
  sda        351G  3.38T      0  1.06K      0   115M
----------  -----  -----  -----  -----  -----  -----

              capacity     operations     bandwidth
pool        alloc   free   read  write   read  write
----------  -----  -----  -----  -----  -----  -----
ZFSssd       351G  3.38T      0    564      0  62.4M
  sda        351G  3.38T      0    564      0  62.4M
----------  -----  -----  -----  -----  -----  -----
 
Please check the other metrics as well (zpool latency, disk latency, CPU load).
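For example, on the target node during a migration (iostat is from the sysstat package; exact columns vary by version):

zpool iostat -vly 5    # ZFS per-vdev latency
iostat -x 5            # device utilization and await times
top                    # CPU load and iowait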
 
