Hello.
An HA live migration of a VM between two nodes was really slow (9.19 MB/s).
The VM has 8 GB of RAM in total.
Storage is shared (external Ceph).
Network speed between the nodes is 10 Gbit/s.
Usually the migration speed looks more like this: 1365.33 MB/s with 90 ms downtime.
Can you please explain what could have caused such a huge speed drop?
The nodes, storage and network devices are not overloaded.
datacenter.cfg:
migration: type=insecure,network=10.10.36.0/23
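
For reference, this is roughly how I would confirm that the migration stream really goes over the 10.10.36.0/23 network (rough sketch only; port 60000 is taken from the task log below and can differ per run):

# On the source node, while a migration is running: show the established
# TCP connection used by the migration stream, with addresses and live
# TCP throughput details.
ss -tinp state established '( sport = :60000 or dport = :60000 )'

Here is the migration task log: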
task started by HA resource agent
2019-04-03 16:10:12 use dedicated network address for sending migration traffic (10.10.36.44)
2019-04-03 16:10:12 starting migration of VM 139 to node 'prox4' (10.10.36.44)
2019-04-03 16:10:12 copying disk images
2019-04-03 16:10:12 starting VM 139 on remote node 'prox4'
2019-04-03 16:10:14 start remote tunnel
2019-04-03 16:10:14 ssh tunnel ver 1
2019-04-03 16:10:14 starting online/live migration on tcp:10.10.36.44:60000
2019-04-03 16:10:14 migrate_set_speed: 8589934592
2019-04-03 16:10:14 migrate_set_downtime: 0.1
2019-04-03 16:10:14 set migration_caps
2019-04-03 16:10:14 set cachesize: 1073741824
2019-04-03 16:10:14 start migrate command to tcp:10.10.36.44:60000
2019-04-03 16:10:15 migration status: active (transferred 7317239, remaining 8598532096), total 8607571968)
2019-04-03 16:10:15 migration xbzrle cachesize: 1073741824 transferred 0 pages 0 cachemiss 0 overflow 0
2019-04-03 16:10:16 migration status: active (transferred 24430966, remaining 8579624960), total 8607571968)
2019-04-03 16:10:16 migration xbzrle cachesize: 1073741824 transferred 0 pages 0 cachemiss 0 overflow 0
2019-04-03 16:10:17 migration status: active (transferred 41115421, remaining 8562257920), total 8607571968)
...
2019-04-03 16:25:05 migration speed: 9.19 MB/s - downtime 414 ms
2019-04-03 16:25:05 migration status: completed
2019-04-03 16:25:08 migration finished successfully (duration 00:14:56)
TASK OK
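
While such a slow migration is running, the transfer rate can also be watched directly from QEMU on the source node (sketch only; VM ID 139 is taken from the log above):

# Open the QEMU human monitor of the running VM on the source node
qm monitor 139
# At the qm> prompt:
info migrate              # throughput, dirty pages rate, expected downtime
info migrate_parameters   # current migration parameters (e.g. max bandwidth)

An iperf test between the two nodes shows the link itself performs as expected: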
root@prox2:~# iperf -c 10.10.36.44
------------------------------------------------------------
Client connecting to 10.10.36.44, TCP port 5001
TCP window size: 325 KByte (default)
------------------------------------------------------------
[ 3] local 10.10.36.42 port 37470 connected with 10.10.36.44 port 5001
[ ID] Interval Transfer Bandwidth
[ 3] 0.0-10.0 sec 11.5 GBytes 9.89 Gbits/sec
root@prox4:~# iperf -s
------------------------------------------------------------
Server listening on TCP port 5001
TCP window size: 85.3 KByte (default)
------------------------------------------------------------
[ 4] local 10.10.36.44 port 5001 connected with 10.10.36.42 port 37470
[ ID] Interval Transfer Bandwidth
[ 4] 0.0-10.0 sec 11.5 GBytes 9.89 Gbits/sec
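
Since the live migration uses a single TCP stream, a longer single-stream run (and a run with parallel streams for comparison) is probably closer to what QEMU sees than the default 10-second test (sketch; standard iperf2 options):

# Single stream, 30 seconds, source -> destination
iperf -c 10.10.36.44 -t 30
# Four parallel streams for comparison
iperf -c 10.10.36.44 -t 30 -P 4

Output of pveversion -v: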
proxmox-ve: 5.3-1 (running kernel: 4.15.18-9-pve)
pve-manager: 5.3-6 (running version: 5.3-6/37b3c8df)
pve-kernel-4.15: 5.2-12
pve-kernel-4.15.18-9-pve: 4.15.18-30
corosync: 2.4.4-pve1
criu: 2.11.1-1~bpo90
glusterfs-client: 3.8.8-1
ksm-control-daemon: 1.2-2
libjs-extjs: 6.0.1-2
libpve-access-control: 5.1-3
libpve-apiclient-perl: 2.0-5
libpve-common-perl: 5.0-43
libpve-guest-common-perl: 2.0-18
libpve-http-server-perl: 2.0-11
libpve-storage-perl: 5.0-34
libqb0: 1.0.3-1~bpo9
lvm2: 2.02.168-pve6
lxc-pve: 3.0.2+pve1-5
lxcfs: 3.0.2-2
novnc-pve: 1.0.0-2
proxmox-widget-toolkit: 1.0-22
pve-cluster: 5.0-31
pve-container: 2.0-31
pve-docs: 5.3-1
pve-edk2-firmware: 1.20181023-1
pve-firewall: 3.0-16
pve-firmware: 2.0-6
pve-ha-manager: 2.0-5
pve-i18n: 1.0-9
pve-libspice-server1: 0.14.1-1
pve-qemu-kvm: 2.12.1-1
pve-xtermjs: 1.0-5
qemu-server: 5.0-43
smartmontools: 6.5+svn4324-1
spiceterm: 3.0-5
vncterm: 1.5-3
zfsutils-linux: 0.7.12-pve1~bpo1