Optimization of CPU usage during live migration.

Jun 13, 2023
Hello,
We need to migrate VMs between nodes, and we need to do it quickly.
Is there a possibility of optimization?
Are there any magic parameters?
It seems that in our case the processor is the bottleneck. When starting the migration it uses the full CPU, which means we cannot saturate the disk and network interface (only ~50%) and the migration takes relatively long. Each of the VM disks is only 1 TB. The server is relatively new, but we still think this is not optimal. We would like both the disks and the network to be fully saturated.
What can we do to make it faster?
Something like RDMA?
Screenshot attached.
 

Attachments

  • 20240801_proxmox_slow_migration.png (544.9 KB)
High memory and CPU utilization - do you have deduplication on?

Code:
zfs get dedup

Show your pool configuration, source and destination:
Code:
zpool status
 
Code:
root@s1prod:~# zfs get dedup
NAME                          PROPERTY  VALUE  SOURCE
main                          dedup     off    default
main/vm-101-disk-0            dedup     off    default
main/vm-101-disk-1            dedup     off    default
main/vm-102-disk-0            dedup     off    default
main/vm-104-disk-0            dedup     off    default
main/vm-107-disk-0            dedup     off    default
main/vm-107-disk-1            dedup     off    default
main/vm-107-disk-2            dedup     off    default
main/vm-109-disk-0            dedup     off    default
main/vm-109-disk-1            dedup     off    default
main/vm-109-disk-2            dedup     off    default
main/vm-110-disk-0            dedup     off    default
main/vm-110-disk-1            dedup     off    default
main/vm-110-disk-2            dedup     off    default
main/vm-110-disk-3            dedup     off    default
main/vm-110-disk-4            dedup     off    default
main/vm-110-disk-5            dedup     off    default
main/vm-110-disk-6            dedup     off    default
main/vm-110-disk-7            dedup     off    default
main/vm-111-disk-0            dedup     off    default
main/vm-111-disk-1            dedup     off    default
main/vm-111-disk-2            dedup     off    default
main/vm-111-disk-3            dedup     off    default
main/vm-112-disk-0            dedup     off    default
main/vm-112-disk-1            dedup     off    default
main/vm-114-disk-0            dedup     off    default
main/vm-120-disk-0            dedup     off    default
main/vm-120-disk-1            dedup     off    default
main/vm-120-disk-2            dedup     off    default
main/vm-120-disk-3            dedup     off    default
main/vm-120-disk-4            dedup     off    default
main/vm-120-disk-5            dedup     off    default
main/vm-122-disk-0            dedup     off    default
main/vm-122-disk-1            dedup     off    default
main/vm-122-disk-2            dedup     off    default
rpool                         dedup     off    default
rpool/ROOT                    dedup     off    default
rpool/ROOT/pve-1              dedup     off    default
rpool/data                    dedup     off    default
rpool/data/subvol-801-disk-0  dedup     off    default


Code:
root@s1prod:~# zpool status
  pool: main
 state: ONLINE
  scan: scrub repaired 0B in 04:26:22 with 0 errors on Sun Jul 14 04:50:23 2024
config:

        NAME                                             STATE     READ WRITE CKSUM
        main                                             ONLINE       0     0     0
          raidz1-0                                       ONLINE       0     0     0
            nvme-SAMSUNG_MZQL27T6HBLA-00A07_XXX_1-part1  ONLINE       0     0     0
            nvme-SAMSUNG_MZQL27T6HBLA-00A07_YYY-part1    ONLINE       0     0     0
            nvme-SAMSUNG_MZQL27T6HBLA-00A07_ZZZ_1-part1  ONLINE       0     0     0
            nvme-eui.36434b305750038600200000001-part1   ONLINE       0     0     0
            nvme-eui.36434b305750039700200000001-part1   ONLINE       0     0     0

errors: No known data errors

  pool: rpool
 state: ONLINE
  scan: scrub repaired 0B in 00:00:18 with 0 errors on Sun Jul 14 00:24:20 2024
config:

        NAME                                           STATE     READ WRITE CKSUM
        rpool                                          ONLINE       0     0     0
          mirror-0                                     ONLINE       0     0     0
            nvme-eui.3636363054b0315800200000001-part3  ONLINE      0     0     0
            nvme-SAMSUNG_MZ1L21T9HCLS-00A07_XXX-part3   ONLINE      0     0     0

errors: No known data errors
root@s1prod:~#
 
I set the insecure option but it still starts SSH - how do I apply it? We do have a direct cable connection (separate network) between the nodes.

Code:
2024-08-01 22:45:38 use dedicated network address for sending migration traffic (10.255.255.252)
2024-08-01 22:45:38 starting migration of VM 101 to node 's2back' (10.255.255.252)
2024-08-01 22:45:38 found local disk 'main:vm-101-disk-1' (attached)
2024-08-01 22:45:38 starting VM 101 on remote node 's2back'
2024-08-01 22:45:41 volume 'main:vm-101-disk-1' is 'main:vm-101-disk-0' on the target
2024-08-01 22:45:41 start remote tunnel
2024-08-01 22:45:42 ssh tunnel ver 1
2024-08-01 22:45:42 starting storage migration
2024-08-01 22:45:42 scsi0: start migration to nbd:10.255.255.252:60001:exportname=drive-scsi0
drive mirror is starting for drive-scsi0
[...]
all 'mirror' jobs are ready
2024-08-01 22:46:04 starting online/live migration on tcp:10.255.255.252:60000
2024-08-01 22:46:04 set migration capabilities
2024-08-01 22:46:04 migration downtime limit: 100 ms
2024-08-01 22:46:04 migration cachesize: 512.0 MiB
2024-08-01 22:46:04 set migration parameters
2024-08-01 22:46:04 start migrate command to tcp:10.255.255.252:60000
2024-08-01 22:46:05 average migration speed: 4.9 GiB/s - downtime 88 ms
2024-08-01 22:46:05 migration status: completed
all 'mirror' jobs are ready
drive-scsi0: Completing block job...
drive-scsi0: Completed successfully.
drive-scsi0: mirror-job finished
2024-08-01 22:46:07 stopping NBD storage migration server on target.
2024-08-01 22:46:12 migration finished successfully (duration 00:00:34)
 
Code:
qm migrate VMxxx nodexxx --migration_type insecure
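
If you want this to apply to all migrations (including ones started from the GUI), the type and migration network can also be set cluster-wide in /etc/pve/datacenter.cfg. A sketch - the subnet below is only an example based on the addresses in your log, so adjust it to your direct-link network:

Code:
# /etc/pve/datacenter.cfg
migration: type=insecure,network=10.255.255.252/30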


raidz1-0 ONLINE 0 0 0
nvme-SAMSUNG_MZQL27T6HBLA-00A07_XXX_1-part1 ONLINE 0 0 0
nvme-SAMSUNG_MZQL27T6HBLA-00A07_YYY-part1 ONLINE 0 0 0
nvme-SAMSUNG_MZQL27T6HBLA-00A07_ZZZ_1-part1 ONLINE 0 0 0
nvme-eui.36434b305750038600200000001-part1 ONLINE 0 0 0
nvme-eui.36434b305750039700200000001-part1 ONLINE 0 0 0
Are the nvme-eui.* devices the same disk model as the SAMSUNG_MZQL27T6HBLA ones?
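
You can cross-check which models hide behind the nvme-eui.* names with nvme-cli (if it is installed), for example:

Code:
nvme list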
 
Thank you for the advice so far. It's better: with insecure mode the total goes from 10 GB/s to 18 GB/s.
CPU is +30% - still a lot, but better.
Maybe there are some more tricks :)? I would be happy to reach 25 GB/s.

Yes, the disks are identical; they are just displayed strangely.
 

Attachments

  • 20240801_proxmox_slow_migration2.png (348.7 KB)
In my tests, it turned out that a powered-off machine migrates much faster. If you can afford to shut the VM down, try this method.
 
Sorry - I did another test the same way, from S2 to S1, and the CPU consumption is similar (insecure mode is enabled); the transfer rate is better.
The question is still open: is there any further way to optimize migration?
I cannot turn the VM off.
 

Attachments

  • 20240801_proxmox_slow_migration3.png (365.9 KB)
Hi All

Just jumping onto this thread with the same issue.

Proxmox - 8.3 latest
20 GB network bandwidth available
When performing a live migration the host hits 100% CPU
Disk I/O goes crazy, and this causes VMs to crash or become unresponsive

This basically means we cannot migrate to or from the host without crashing other clients on the host.

This needs to be resolved as a matter of urgency if it's a bug, as a live migration shouldn't affect any other VMs on any source or target hosts.

@Simryc did you ever get a working solution for this issue?
Or a workaround?


Cheers
G
 
Disk I/O goes crazy, and this causes VMs to crash or become unresponsive
Yeah, stability is by far more important than pure performance.

This basically means we cannot migrate to or from the host without crashing other clients on the host.
While I have no solution for you, I want to mention a possible workaround:

You can limit the used bandwidth on purpose. This will decrease the overall system load and hopefully eliminate crashes. Look at Datacenter --> Options --> Bandwidth Limits.
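
If you prefer the CLI, the same limits live in /etc/pve/datacenter.cfg; values are in KiB/s, and the numbers below are only examples:

Code:
# /etc/pve/datacenter.cfg - cap migrations and disk moves at ~500 MB/s
bwlimit: migration=512000,move=512000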

Good luck!
 
Thanks @UdoB

That's a good suggestion; it hadn't crossed my mind.

I'll look into it a little more.

My initial concern is the high CPU generated; are you suggesting that throttling the migration network will reduce CPU and disk I/O?

I suspect the high disk I/O is part of the problem; one of our team mentioned that they saw over 45 Gb/s of I/O on one of the VMs before it crashed.

The VMs themselves were doing nothing; it was the host disk activity being seen by the VM.

I'll have to ask them again to confirm I've understood what they witnessed correctly.

Cheers
G
 
My initial concern is the high CPU generated; are you suggesting that throttling the migration network will reduce CPU and disk I/O?
Yes, that's the idea :-)

Thought experiment: imagine throttling to a very low (unusable) bandwidth. That couldn't stress your system, could it?

Of course you want to test some values with a test VM and increase them step by step to get an acceptable speed while staying stable. This iterative optimization process may be problematic in itself, because you may accidentally go over the top again...
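
For that kind of step-by-step testing, the limit can also be overridden per migration with qm; the VM ID, node name and the 100 MiB/s cap below are only placeholders (qm takes the value in KiB/s):

Code:
qm migrate 999 targetnode --online --bwlimit 102400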
 
Thanks for the methodology - I'll start with 50% of the current bandwidth and see how it goes on a test VM.

Have you tested with insecure mode?
Even in this thread we see mixed results.

It makes sense that secure mode will have more overhead than insecure mode due to in-flight encryption/decryption.
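
One rough way to see what a single encrypted stream costs on a given link (a sketch - the target IP is only an example; watch CPU on both ends and compare with a plain iperf3 run):

Code:
dd if=/dev/zero bs=1M count=10000 | ssh 10.255.255.252 'cat > /dev/null'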

But why would we get mixed results when testing insecure mode?

Any ideas?

""Cheers
G
 
  • Like
Reactions: velocity08
@UdoB I've had a bit of a play with this today and throttled the replication tasks as well.

That seems to have done the trick with the high CPU, but it also slows down a migration immensely, especially when you have a VM that's 3+ TB or one with multiple 1 TB drives.

I was testing replication today with a VM to see if we could do this a little better, especially when the VM is large and the migration is cross-platform CPU, i.e. Intel > AMD.

I've found that when the replication is in sync and a live migration is performed, only a small snapshot is taken at the end > VM differential data is copied > in-memory snapshot taken > copied > VM stopped on the source and powered up on the destination server.

This saves what could be a fair bit of downtime and other issues; anyhow, just thought I would share.
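
For reference, a rate-limited replication job can also be created from the CLI; the VM ID, target node, schedule and rate below are only examples:

Code:
# replicate VM 101 to node s2back every 15 minutes, capped at 500 MB/s
pvesr create-local-job 101-0 s2back --schedule '*/15' --rate 500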

Thanks for the advice.

Cheers
G
 