Offline Migration extremely slow

luigima

New Member
Aug 13, 2018
Hey,

I noticed a huge issue. When I try to migrate a VM to a different node I get extremely slow transfer rates.
This is unexpected, since I use a dedicated Gigabit network for migration (which is unused except for migrations). The insecure flag is set as well.
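For reference, the migration settings in /etc/pve/datacenter.cfg look roughly like this on my cluster (the CIDR is just the subnet of the dedicated migration NIC, adjust to taste):

Code:
# /etc/pve/datacenter.cfg -- migration section (example values)
migration: type=insecure,network=10.10.10.0/24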

Have a look at this migration log:

Code:
task started by HA resource agent
2019-04-25 11:53:30 use dedicated network address for sending migration traffic (10.10.10.12)
2019-04-25 11:53:30 starting migration of CT 113 to node 'node2' (10.10.10.12)
2019-04-25 11:53:31 found local volume 'local-lvm:vm-113-disk-0' (in current VM config)
491520+0 records in
491520+0 records out
32212254720 bytes (32 GB, 30 GiB) copied, 7183.68 s, 4.5 MB/s
3043+3574569 records in
3043+3574569 records out
32212254720 bytes (32 GB, 30 GiB) copied, 7439.18 s, 4.3 MB/s
2019-04-25 13:57:31 # /usr/bin/ssh -e none -o 'BatchMode=yes' -o 'HostKeyAlias=node2' root@10.10.10.12 pvesr set-state 113 \''{}'\'
  Logical volume "vm-113-disk-0" successfully removed
2019-04-25 13:57:32 start final cleanup
2019-04-25 13:57:33 migration finished successfully (duration 02:04:03)
TASK OK

It took two hours of downtime to migrate the service, at 4.5 MB/s...
Maybe someone can point me to the bottleneck. I do not know what could possibly be causing such slow performance.
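If it helps, I can watch the disks on both nodes during the next attempt to see whether they are actually saturated, e.g. with iostat from the sysstat package (the interval is just an example):

Bash:
# Extended per-device statistics in MB/s, refreshed every 5 seconds
iostat -xm 5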

Thank you very much for your help!
 
How fast can your storages write/read?
 
I am migrating from local storage to local storage (single disks). Both vary between 70-80 MB/s.
 
Both vary between 70-80 MB/s.
I suppose these are spinners and that's their nominal speed. Did you test with fio or similar software (not dd) to see what their speed is during actual operation?
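Something along these lines gives a more realistic number than dd (the file name, size and job parameters are only placeholders; run it on the storage that backs the VM disks):

Bash:
# Sequential write with direct I/O, bypassing the page cache
fio --name=seqwrite --filename=/var/lib/vz/fio-test --rw=write --bs=1M --size=4G \
    --direct=1 --ioengine=libaio --iodepth=16 --group_reporting
# Clean up the test file afterwards
rm /var/lib/vz/fio-test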
 
Are these okay numbers?

I have no real reference, except a feeling that fifteen minutes to move a 32 GB VM is too long.
I kinda' expected faster migrations...

The migration is done from a Dell R710 to a Dell R720.
Each node in the cluster uses a battery-backed RAID 5 array with 6x 2 TB Seagate Constellation SAS disks.
The network is 1 Gbps, and iperf reports speeds of up to about 950 Mbps.
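(That figure is from a plain iperf run between the two boxes, roughly like the commands below with iperf3; the IP is the one the migration uses.)

Bash:
# On the target node (cyndane5)
iperf3 -s
# On the source node (dragonborn)
iperf3 -c 192.168.0.4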


Bash:
root@dragonborn:~# qm migrate 107 cyndane5
2021-08-17 19:21:13 starting migration of VM 107 to node 'cyndane5' (192.168.0.4)
2021-08-17 19:21:14 found local disk 'local-lvm:vm-107-disk-0' (in current VM config)
2021-08-17 19:21:14 copying local disk images
2021-08-17 19:21:16 volume pve/vm-107-disk-0 already exists - importing with a different name
2021-08-17 19:21:17   Logical volume "vm-107-disk-2" created.
2021-08-17 19:28:43 524288+0 records in
2021-08-17 19:28:43 524288+0 records out
2021-08-17 19:28:43 34359738368 bytes (34 GB, 32 GiB) copied, 446.933 s, 76.9 MB/s
2021-08-17 19:36:23 39+2096845 records in
2021-08-17 19:36:23 39+2096845 records out
2021-08-17 19:36:23 34359738368 bytes (34 GB, 32 GiB) copied, 905.981 s, 37.9 MB/s
2021-08-17 19:36:23 successfully imported 'local-lvm:vm-107-disk-2'
2021-08-17 19:36:23 volume 'local-lvm:vm-107-disk-0' is 'local-lvm:vm-107-disk-2' on the target
Logical volume "vm-107-disk-0" successfully removed
2021-08-17 19:36:26 migration finished successfully (duration 00:15:13)
 
I'm kinda' reluctant to initiate a migration of two VMs that are over 2 TB each because of this.
If anybody has any ideas on how to tune and streamline disk speeds with the current RAID 5 arrays, I'm all ears!

I realize RAID 5 is not that efficient, but that's what I can afford ATM!
For a homelab setup, RAID 6 is sort of cost-prohibitive, and RAID 1 even more so!
 
More speed info.


Bash:
root@cyndane5:~# fdisk -l /dev/sda
Disk /dev/sda: 9.1 TiB, 9999220736000 bytes, 19529728000 sectors
Disk model: PERC H710     
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: gpt
Disk identifier: 1D160705-59BF-4481-BF42-151F7A4EF1DB

Device       Start         End     Sectors  Size Type
/dev/sda1       34        2047        2014 1007K BIOS boot
/dev/sda2     2048     1050623     1048576  512M EFI System
/dev/sda3  1050624 19529727966 19528677343  9.1T Linux LVM


Bash:
root@cyndane5:~# hdparm -tT /dev/sda

/dev/sda:
Timing cached reads:   14810 MB in  1.99 seconds = 7424.34 MB/sec
SG_IO: bad/missing sense data, sb[]:  70 00 05 00 00 00 00 0d 00 00 00 00 20 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Timing buffered disk reads: 2050 MB in  3.00 seconds = 683.04 MB/sec




The Proxmox node I'm migrating from is slightly slower,
but I guess that doesn't have too much of an impact.

Bash:
root@dragonborn:~# hdparm -tT /dev/sda

/dev/sda:
Timing cached reads:   10538 MB in  2.00 seconds = 5282.16 MB/sec
SG_IO: bad/missing sense data, sb[]:  70 00 05 00 00 00 00 0b 00 00 00 00 20 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Timing buffered disk reads: 1282 MB in  3.02 seconds = 424.70 MB/sec
 
