Problem: Migration Speed Lower than 10Gb NIC & SSD RAID 10

chaiyuttochai
Member · Apr 13, 2017 · Thailand · tomato-ideas.com
Hi, I currently have a problem: migrating and moving LXC containers and VMs from one node to another is really slow.


My physical servers are Dell R630s with RAID 10 (PERC H730 Mini) on six Dell SAS SSDs of 1.9 TB each (each drive is rated at about 1.2 GB/s read / 800 MB/s write).
I think RAID 10 with 6 drives should read/write at no less than 2 GB/s.
Each server also has an Intel 10 Gbit NIC (SFP+), which should deliver close to 1 GByte/s (the private network is separate from the public network).
I already checked the fiber network between the nodes with iperf. The result is 980-1,150 MByte/s (about 10 Gbit/s on average).

BUT !!!
When I migrate an LXC container to another node with the same hardware, it takes far too long (about 40 minutes for roughly 32 GB).
The Proxmox network graph maxes out at only 250-300 MB/s. Why? And how can I tune or configure things to get better performance?

Here is what pveperf shows:

Code:
root@r02:~# pveperf
CPU BOGOMIPS:      211246.80
REGEX/SECOND:      2892091
HD SIZE:           93.99 GB (/dev/mapper/pve-root)
BUFFERED READS:    1229.87 MB/sec
AVERAGE SEEK TIME: 0.12 ms
FSYNCS/SECOND:     4530.97
DNS EXT:           193.08 ms
DNS INT:           28.77 ms (local)


And a dd write test:

Code:
root@r02:~# dd if=/dev/zero of=/tmp/test.img bs=1G count=1 oflag=dsync
1+0 records in
1+0 records out
1073741824 bytes (1.1 GB, 1.0 GiB) copied, 1.97299 s, 544 MB/s
root@r02:~# dd if=/dev/zero of=/tmp/test.img bs=1G count=1 oflag=dsync
1+0 records in
1+0 records out
1073741824 bytes (1.1 GB, 1.0 GiB) copied, 1.54228 s, 696 MB/s
root@r02:~#

I also checked datacenter.cfg:

Code:
root@r02:~# cat /etc/pve/datacenter.cfg
keyboard: en-us
migration_unsecure: 1


Any suggestions, please?
 
I think RAID 10 with 6 drives should read/write at no less than 2 GB/s.

That depends on the protocol version. If you're using 6 Gbit SAS, the theoretical limit of one link is 600 MB/s; since RAID 10 stripes writes across only 3 mirror pairs, you get a theoretical write limit of 1800 MB/s. So the theoretical write limit is already lower than your expectations, and the real world is even worse.
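To make that arithmetic explicit, here is a small sketch. The link speeds are the nominal per-lane SAS ceilings; real controllers and drives land well below them:

```python
# Back-of-envelope ceilings, not measurements.
# RAID 10 with 6 drives stripes writes across 3 mirror pairs,
# so sequential writes top out at roughly 3x one link.
SAS_LINK_MB_S = {"6Gb SAS": 600, "12Gb SAS": 1200}  # nominal per-lane MB/s

def raid10_write_ceiling(drives: int, link_mb_s: int) -> int:
    """Theoretical sequential-write ceiling: one useful stream per mirror pair."""
    mirror_pairs = drives // 2
    return mirror_pairs * link_mb_s

for name, link in SAS_LINK_MB_S.items():
    print(name, raid10_write_ceiling(6, link), "MB/s")
```

So even on paper, 6 Gb SAS caps a 6-drive RAID 10 write at 1800 MB/s, while 12 Gb SAS doubles that.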

Why? And how can I tune or configure things to get better performance?

The data is sent encrypted over the network via SSH, so the limit is the single-core encryption throughput of your CPU. You can try reducing the encryption overhead with a cheaper (less secure) cipher.
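For example, one way to steer SSH toward a cheaper cipher is to list it first in the source node's client config. This is only a sketch; the host pattern and cipher choice are assumptions you'd adapt to your own network:

```
# /root/.ssh/config (sketch) - SSH picks the first mutually supported cipher,
# so put the cheapest one for your CPU at the front.
Host 10.0.0.*
    Ciphers aes128-gcm@openssh.com,aes128-ctr
```

AES-GCM is usually the fastest choice on CPUs with AES-NI; benchmark before committing to one.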
 
That depends on the protocol version. If you're using 6 Gbit SAS, the theoretical limit of one link is 600 MB/s; since RAID 10 stripes writes across only 3 mirror pairs, you get a theoretical write limit of 1800 MB/s. So the theoretical write limit is already lower than your expectations, and the real world is even worse.



The data is sent encrypted over the network via SSH, so the limit is the single-core encryption throughput of your CPU. You can try reducing the encryption overhead with a cheaper (less secure) cipher.

My SAS SSDs support 12 Gb, and my RAID controller also supports 12 Gb.
So it shouldn't be slower than 1 GByte/s, surely. Why does the network max out at around 300 MB/s on average?

My expected speed: if I move a 32 GB LXC container to another node, it should take less than 3 minutes
(at 1 GB/s it should in theory finish within a minute, because 60 s at 1 GB/s is already 60 GB).

As for the suggested fix regarding "encrypted over the network via SSH":
which option should I configure? I checked the config at ~/.ssh/config.
It already contains the following, so I shouldn't need to change /etc/ssh/sshd_config as well, am I right?



Code:
root@r10:~# cat ~/.ssh/config
Ciphers aes128-ctr,aes192-ctr,aes256-ctr,aes128-gcm@openssh.com,aes256-gcm@openssh.com,chacha20-poly1305@openssh.com


Any other options to configure?
 
My SAS SSDs support 12 Gb, and my RAID controller also supports 12 Gb.
So it shouldn't be slower than 1 GByte/s, surely. Why does the network max out at around 300 MB/s on average?

Okay, 12G SAS is capable of what you want.

My expected speed: if I move a 32 GB LXC container to another node, it should take less than 3 minutes
(at 1 GB/s it should in theory finish within a minute, because 60 s at 1 GB/s is already 60 GB).


Not with SSH; there the limit is the speed of one CPU core.

Any other options to configure?

You can try to find the best cipher for your CPU — just run a benchmark. Please also post how fast your CPU is.
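For that benchmark, something along these lines works. openssl's EVP numbers approximate what a single core can encrypt per second; the cipher names here are common OpenSSH choices, adjust to what your nodes offer:

```shell
# Single-core encryption throughput per cipher; the last line of each run
# shows bytes/s at several block sizes. Higher is better for SSH transfers.
for evp in aes-128-ctr aes-128-gcm chacha20-poly1305; do
  openssl speed -seconds 1 -evp "$evp" 2>/dev/null | tail -n 1
done
```

Whichever cipher wins here is the one to put first in the `Ciphers` line of your SSH config.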
 
Here is the resulting speed.
It maxes out at ~376 MB/s only o_O
Code:
Proxmox Virtual Environment 6.2-6
Virtual Machine 108 (ubuntu1804-template) on node 'r10'
deprecated setting 'migration_unsecure' and new 'migration: type' set at same time! Ignore 'migration_unsecure'
2020-07-05 20:48:44 use dedicated network address for sending migration traffic (10.0.0.107)
2020-07-05 20:48:44 starting migration of VM 108 to node 'r10' (10.0.0.107)
2020-07-05 20:48:45 found local disk 'local-lvm:vm-108-disk-0' (in current VM config)
2020-07-05 20:48:45 copying local disk images
2020-07-05 20:48:47   Logical volume "vm-108-disk-0" created.
2020-07-05 20:51:49 1048576+0 records in
2020-07-05 20:51:49 1048576+0 records out
2020-07-05 20:51:49 68719476736 bytes (69 GB, 64 GiB) copied, 182.571 s, 376 MB/s
2020-07-05 20:51:50 122+4189738 records in
2020-07-05 20:51:50 122+4189738 records out
2020-07-05 20:51:50 68719476736 bytes (69 GB, 64 GiB) copied, 182.763 s, 376 MB/s
2020-07-05 20:51:50 successfully imported 'local-lvm:vm-108-disk-0'
2020-07-05 20:51:50 volume 'local-lvm:vm-108-disk-0' is 'local-lvm:vm-108-disk-0' on the target
  Logical volume "vm-108-disk-0" successfully removed
2020-07-05 20:51:51 # /usr/bin/ssh -e none -o 'BatchMode=yes' -o 'HostKeyAlias=r10' root@10.0.0.107 pvesr set-state 108 \''{}'\'
2020-07-05 20:51:52 migration finished successfully (duration 00:03:08)
TASK OK
 
I found one change in /etc/pve/datacenter.cfg that helps a little.
I saw the error log about the deprecated option "migration_unsecure",
so I changed the config from "migration_unsecure: 1" to "migration: insecure,network=x.x.x.x/24".
The resulting speed is better than before, but it maxes out at 600 MB/s.

Does anyone have any other suggestions for getting better speed than this?
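For reference, the relevant part of /etc/pve/datacenter.cfg then looks roughly like this (the subnet is a placeholder for your dedicated migration network):

```
# /etc/pve/datacenter.cfg (sketch)
keyboard: en-us
migration: insecure,network=x.x.x.x/24
```

With `insecure`, migration traffic on that network skips SSH encryption, which removes the single-core cipher bottleneck.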
 
I saw the error log about the deprecated option "migration_unsecure",
so I changed the config from "migration_unsecure: 1" to "migration: insecure,network=x.x.x.x/24".
The resulting speed is better than before, but it maxes out at 600 MB/s.

Oh yes, I knew there was another option but I forgot.

Does anyone have any other suggestions for getting better speed than this?

That is the ballpark speed your dd showed, isn't it? So that should be the next bottleneck to work on.

First node --> CPU(s) 56 x Intel(R) Xeon(R) Gold 5117 CPU @ 2.00GHz (2 Sockets)
Second node --> CPU(s) 48 x Intel(R) Xeon(R) CPU E5-2690 v3 @ 2.60GHz (2 Sockets)

That is what I thought. Those are quite low-clocked processors, and they are the performance bottleneck for your SSH/SCP, because we are still in single-threaded mode, so only one core can work at a time. The turbo clock speed on those many-core CPUs is very low. You would get much better SSH/SCP throughput from a model with far fewer cores but 3.5+ GHz clocks.
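A quick way to see that single-core SSH ceiling directly, with no disks involved, is to push zeros through an SSH pipe. The target host here is taken from the migration log above; swap in your own host and cipher:

```shell
# Throughput of one SSH stream: dd prints the MB/s summary on stderr.
# Whatever number you see here is one CPU core doing the encryption.
dd if=/dev/zero bs=1M count=2048 | \
  ssh -c aes128-gcm@openssh.com root@10.0.0.107 'cat > /dev/null'
```

If this tops out near your migration speed, the CPU/cipher is the bottleneck, not the storage or the NIC.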
 
So even in insecure mode, the migration takes place over the SSH protocol and in single-threaded mode?
Hmm, if so, are there really no other, faster protocols that could run the migration multithreaded and thereby use the full bandwidth?

I have a similar situation with migration speed, and from what I've read there are many such threads on the forum.
 
Also, don't mix up available bandwidth with real-world performance.
Theoretical performance is often 5-10x higher than real-world performance. Especially on SSDs you can see significant drops when comparing peak performance (what's in the datasheet) with sustained performance.
If your SSDs are well filled or have entered the "overwrite" state, what you think is a plain write actually becomes a read-modify-write, which slows things down.
In my experience, RAID 10 on SSDs is typically a lot slower than N times a single SSD, due to striping overhead...
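One cheap way to separate burst from sustained behaviour is to write more data than the caches can absorb and force a flush at the end. The size and path here are arbitrary sketch values; point the output file at the storage you actually migrate from:

```shell
# 8 GiB sequential write with a final fsync: large enough to get past the
# page cache and controller write-back cache, so the MB/s figure printed
# by dd is closer to sustained speed than a 1 GB burst.
dd if=/dev/zero of=/var/lib/vz/ddsustained.img bs=1M count=8192 conv=fsync
rm -f /var/lib/vz/ddsustained.img
```

If this comes in well under the short dd runs above, the drives (or the RAID write path) are throttling under sustained load.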
 
