Hi,
We have a 6-node cluster; each node is connected over a 2x1Gbit trunk to a ZFS NFS server that has a 4x1Gbit trunk and is loaded with SSDs.
The disk IO on the ZFS server is really high; it has 12 SSDs in RAIDz2 (comparable to RAID6).
We have another ZFS SSD server, also connected with a 4x1Gbit trunk.
Now we want to migrate some disks to this new storage server with live migration. When we start the migration, all VMs on the node that initiates the move become really slow. Logging in over RDP takes about 5 minutes on any VM, while it normally takes a few seconds. Web servers crash or respond very slowly.
So when we move a single disk of a single VM, all VMs on that node become unusable.
How can we limit the disk move so that it only uses, say, 80% of the 1000Mbit link? The SSDs are almost idle during the move; the bottleneck is the network link.
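To make the target concrete, here's a minimal sketch of the arithmetic behind that 80% figure; the link speed is from our setup, the disk sizes are just illustrative assumptions:

```python
# Rough arithmetic for the cap we have in mind. Assumption: the disk move
# effectively saturates a single 1 Gbit/s member link of the trunk, and we
# want to leave ~20% headroom for normal VM/NFS traffic.

LINK_MBIT = 1000        # one 1 Gbit/s link
CAP_FRACTION = 0.8      # target: let the move use at most 80% of it

cap_mbit = LINK_MBIT * CAP_FRACTION   # 800 Mbit/s
cap_mbyte = cap_mbit / 8              # = 100 MB/s


def move_time_hours(disk_gb: float, rate_mbyte_s: float = cap_mbyte) -> float:
    """Estimate how long a disk move would take at the capped rate."""
    return disk_gb * 1024 / rate_mbyte_s / 3600


if __name__ == "__main__":
    print(f"cap: {cap_mbit:.0f} Mbit/s = {cap_mbyte:.0f} MB/s")
    for size_gb in (100, 250, 500):   # hypothetical disk sizes
        print(f"{size_gb} GB disk: ~{move_time_hours(size_gb):.1f} h at the cap")
```

The idea is that even at the cap the move finishes in a reasonable time, while the remaining 20% (plus the second trunk member) should keep the running VMs responsive.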
The cost of replacing all the 1Gbit links with 10Gbit, including a redundant managed switch, is really high, so that's not an option.