Offline Migration extremely slow

luigima

New Member
Aug 13, 2018
Hey,

I noticed a huge issue. When I try to migrate a VM to a different node I get extremely slow transfer rates.
This is unexpected, since I use a dedicated Gigabit network for migration (which is unused except for migrations). The insecure flag is set as well.
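For reference, the migration settings in /etc/pve/datacenter.cfg look roughly like this on my cluster (the CIDR is just the subnet of the dedicated migration NIC, adjust to taste):

Code:
# /etc/pve/datacenter.cfg -- migration section (example values)
migration: type=insecure,network=10.10.10.0/24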

Have a look at this migration log:

Code:
task started by HA resource agent
2019-04-25 11:53:30 use dedicated network address for sending migration traffic (10.10.10.12)
2019-04-25 11:53:30 starting migration of CT 113 to node 'node2' (10.10.10.12)
2019-04-25 11:53:31 found local volume 'local-lvm:vm-113-disk-0' (in current VM config)
491520+0 records in
491520+0 records out
32212254720 bytes (32 GB, 30 GiB) copied, 7183.68 s, 4.5 MB/s
3043+3574569 records in
3043+3574569 records out
32212254720 bytes (32 GB, 30 GiB) copied, 7439.18 s, 4.3 MB/s
2019-04-25 13:57:31 # /usr/bin/ssh -e none -o 'BatchMode=yes' -o 'HostKeyAlias=node2' root@10.10.10.12 pvesr set-state 113 \''{}'\'
  Logical volume "vm-113-disk-0" successfully removed
2019-04-25 13:57:32 start final cleanup
2019-04-25 13:57:33 migration finished successfully (duration 02:04:03)
TASK OK

It took two hours of downtime to migrate the service, at 4.5 MB/s...
Maybe someone can point me to the bottleneck. I do not know what could possibly be causing such slow performance.
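If it helps, I can watch the disks on both nodes during the next attempt to see whether they are actually saturated, e.g. with iostat from the sysstat package (the interval is just an example):

Bash:
# Extended per-device statistics in MB/s, refreshed every 5 seconds
iostat -xm 5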

Thank you very much for your help!
 
How fast can your storages write/read?
 
I am migrating from local storage to local storage (single disks). Both vary between 70-80 MB/s.
 
Both vary between 70-80 MB/s.
I suppose these are spinners and that's their nominal speed. Did you test with fio or similar software (not dd) to see what their speed is during actual operation?
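Something along these lines gives a more realistic number than dd (the file name, size and job parameters are only placeholders; run it on the storage that backs the VM disks):

Bash:
# Sequential write with direct I/O, bypassing the page cache
fio --name=seqwrite --filename=/var/lib/vz/fio-test --rw=write --bs=1M --size=4G \
    --direct=1 --ioengine=libaio --iodepth=16 --group_reporting
# Clean up the test file afterwards
rm /var/lib/vz/fio-test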
 
Are these okay numbers?

I have no real reference, except a feeling that fifteen minutes to move a 32 GB VM is too long.
I kinda' expected faster migrations...

The migration is done from a Dell R710 to a Dell R720.
Each node in the cluster uses a battery-backed RAID 5 array with 6x 2 TB Seagate Constellation SAS disks.
The network is 1 Gbps, and iperf reports speeds of up to about 950 Mbps.
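(That figure is from a plain iperf run between the two boxes, roughly like the commands below with iperf3; the IP is the one the migration uses.)

Bash:
# On the target node (cyndane5)
iperf3 -s
# On the source node (dragonborn)
iperf3 -c 192.168.0.4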


Bash:
root@dragonborn:~# qm migrate 107 cyndane5
2021-08-17 19:21:13 starting migration of VM 107 to node 'cyndane5' (192.168.0.4)
2021-08-17 19:21:14 found local disk 'local-lvm:vm-107-disk-0' (in current VM config)
2021-08-17 19:21:14 copying local disk images
2021-08-17 19:21:16 volume pve/vm-107-disk-0 already exists - importing with a different name
2021-08-17 19:21:17   Logical volume "vm-107-disk-2" created.
2021-08-17 19:28:43 524288+0 records in
2021-08-17 19:28:43 524288+0 records out
2021-08-17 19:28:43 34359738368 bytes (34 GB, 32 GiB) copied, 446.933 s, 76.9 MB/s
2021-08-17 19:36:23 39+2096845 records in
2021-08-17 19:36:23 39+2096845 records out
2021-08-17 19:36:23 34359738368 bytes (34 GB, 32 GiB) copied, 905.981 s, 37.9 MB/s
2021-08-17 19:36:23 successfully imported 'local-lvm:vm-107-disk-2'
2021-08-17 19:36:23 volume 'local-lvm:vm-107-disk-0' is 'local-lvm:vm-107-disk-2' on the target
Logical volume "vm-107-disk-0" successfully removed
2021-08-17 19:36:26 migration finished successfully (duration 00:15:13)
 
I'm kinda' reluctant to initiate a migration of two VMs that are over 2 TB each because of this.
If anybody has any ideas on how to tune and streamline disk speeds with the current RAID 5 arrays, I'm all ears!

I realize RAID 5 is not that efficient, but that's what I can afford ATM!
For a homelab setup, RAID 6 is sort of cost-prohibitive, and RAID 1 even more so!
 
More speed info.


Bash:
root@cyndane5:~# fdisk -l /dev/sda
Disk /dev/sda: 9.1 TiB, 9999220736000 bytes, 19529728000 sectors
Disk model: PERC H710     
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: gpt
Disk identifier: 1D160705-59BF-4481-BF42-151F7A4EF1DB

Device       Start         End     Sectors  Size Type
/dev/sda1       34        2047        2014 1007K BIOS boot
/dev/sda2     2048     1050623     1048576  512M EFI System
/dev/sda3  1050624 19529727966 19528677343  9.1T Linux LVM


Bash:
root@cyndane5:~# hdparm -tT /dev/sda

/dev/sda:
Timing cached reads:   14810 MB in  1.99 seconds = 7424.34 MB/sec
SG_IO: bad/missing sense data, sb[]:  70 00 05 00 00 00 00 0d 00 00 00 00 20 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Timing buffered disk reads: 2050 MB in  3.00 seconds = 683.04 MB/sec




The Proxmox node I'm migrating from is slightly slower,
but I guess that doesn't have too much of an impact.

Bash:
root@dragonborn:~# hdparm -tT /dev/sda

/dev/sda:
Timing cached reads:   10538 MB in  2.00 seconds = 5282.16 MB/sec
SG_IO: bad/missing sense data, sb[]:  70 00 05 00 00 00 00 0b 00 00 00 00 20 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Timing buffered disk reads: 1282 MB in  3.02 seconds = 424.70 MB/sec
 
