disk move slow

micush

Renowned Member
Jul 18, 2015
Hi,

I've recently upgraded my network stack to 10 gig. I've added "migration_unsecure: 1" to /etc/pve/datacenter.cfg and now live migrations routinely see over 600MBps.
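For reference, the relevant lines have roughly this shape (the CIDR is just a placeholder; newer PVE releases express the same thing via the "migration" option instead of the old flag):

  # /etc/pve/datacenter.cfg
  migration_unsecure: 1
  # newer releases use something like: migration: type=insecure,network=10.10.10.0/24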

Writing directly from the hosts to the NFS shared storage using dd also sees 600MBps.
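A test of that kind would look roughly like this (path and size are placeholders; oflag=direct keeps the page cache from inflating the result):

  dd if=/dev/zero of=/mnt/pve/nfs-storage/ddtest.bin bs=1M count=10240 oflag=direct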

However, when clicking on the "move disk" button to move a disk image from one storage to another, I never go above 100MBps.

Can this be improved somehow? Everything else works as expected with the 10 gig network upgrade except for this one feature.

Regards,

micush
 
Do you have any connections between the servers running at 1Gbps?

I know there have been some improvements recently to live migration so that it can use a different network than vmbr0.

However, I'm not sure if the same has been done for storage migrations, so if your default network is 1Gbps you'd be hitting that bandwidth limit here.
 
There are no 1 gig links anywhere.

All other functions routinely exceed 500MBps.
 
Are you talking about "move disk" while the VM is running or while the VM is offline? The latter uses "qemu-img convert", but the former uses QEMU's "drive-mirror" functionality.
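Roughly speaking, and with placeholder paths rather than the exact commands PVE generates:

  # offline "move disk": a straight full-image copy
  qemu-img convert -p -f qcow2 -O qcow2 \
      /mnt/pve/src-storage/images/100/vm-100-disk-0.qcow2 \
      /mnt/pve/dst-storage/images/100/vm-100-disk-0.qcow2

  # online "move disk": the running QEMU process mirrors the drive instead
  # (QMP "drive-mirror"), copying existing blocks while also replaying every
  # new guest write until source and target converge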
 
Offline the disk moves quite quickly.

Online it takes much longer.

As mentioned above, an online move is a bit more than a plain move/copy command: because the VM is online, the current disk is still being read and written, and those ongoing changes have to be handled as well.
 
Offline the disk moves quite quickly.

Online it takes much longer.

Less performance for an online move is to be expected - it needs to copy the existing data while also keeping track of new changes and syncing those as well.
 
Hi

I also happen to run a 10gig network. Strangely, VM migration is usually quick, but when moving a VM disk while the VM is off I only get about 10Mbps. Is there a config I should edit or look into?

Thanks


[screenshot attached: upload_2018-9-2_20-32-8.png]
 
Sorry, it looks like my NFS share had a config issue. I also disabled an extra 1gig port on the Proxmox host to make sure everything was going over the 10gig ports. Now I'm running into the disk speed limit instead, which is much faster than 10Mbps :)
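For anyone hitting the same thing, a quick way to double-check where the NFS traffic actually goes (server address and NIC name are placeholders):

  nfsstat -m                        # active mount options per NFS mount
  ip route get 192.168.10.50        # which local interface is used to reach the NFS server
  ethtool enp129s0f0 | grep Speed   # confirm the link negotiated at 10000Mb/s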
 
Offline the disk moves quite quickly.

Online it takes much longer.
Hi,
since this is your last post in your thread, I assume you figured it out somehow - I am in a similar scenario, only two years later :) So my question is: have you found a cure, or did you choose a different working model? Because online moves from one shared storage to another are driving me crazy...

My test case is an online move of an 80GB qcow2 disk from one shared storage to another. The weird thing is that it takes 4 minutes to write it from NVMe to the EMC VNXe3200 storage, but 12 minutes to the SLES Supermicro server. Rsyncing an 80GB file to the same SLES Supermicro server takes only 4 minutes. So I guess it is not network related, but some combination of the Proxmox <-> SLES NFS server being "out of tune". But what to tune? I tried different NFS mount params and different sysctl TCP memory params, with no significant improvement.

We want to move our VMs to the SLES server in HA and keep the EMC only for backups, but writes to the EMC are so much faster that it feels like a strange move for now. Any clue would be greatly appreciated.
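For context, an NFS storage entry with explicit mount options has roughly this shape in /etc/pve/storage.cfg (server, export and option values are placeholders, not a recommendation):

  nfs: sles-nfs
      server 192.168.10.50
      export /srv/nfs/pve
      path /mnt/pve/sles-nfs
      content images
      options vers=4.2,hard,noatime,rsize=1048576,wsize=1048576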

Thank you.
 
Just my 2 cents: move disk is still insanely slow. 200-300mbps moving from a Synology NAS to local zfs via a 10gb connection.
And that is when no other network traffic is happening.
All other sync options, like replication and backup, are up to speed - it's just the move disk thing that is stuck in the last century.
 
move disk is still insanely slow. 200-300mbps moving from a Synology NAS to local zfs

Millibits per second? Did you mean MBps = megabytes? (Capitalization is important - for example, Mbps = megabits is used for wire speed and so on. Yeah, nitpicking, sorry...)

Two details to check:
  • the source must be able to deliver; probably your last sentence implies that it does
  • migration may be configured to use a limit: "Datacenter --> Options --> Bandwidth Limits" - perhaps you set this at some point and forgot about it...? (see the sketch right below)
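The GUI value ends up in datacenter.cfg, so a leftover line along these lines (values are in KiB/s and purely illustrative) would quietly cap "move disk":

  # /etc/pve/datacenter.cfg
  bwlimit: move=102400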
How is your storage inside the Synology configured? What kind of devices and which filesystems? If rotating rust: with or without speedy metadata devices?

And of course your local zfs needs to be able to actually write with the speed you seem to expect, both bandwidth and IOPS-wise.
  • did you run "fio" with a comparably large chunk of test data to confirm this? (see the example after this list)
  • how is the Pool organized? (Mirrors?)
  • are these "Enterprise SSDs" with PLP? A cheap consumer SSD may not be able to sustain writes at 200 MB/s for more than a handful of seconds...
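A quick sanity check on the target pool could look like this (dataset path, size and job parameters are placeholders, not a definitive benchmark recipe):

  fio --name=seqwrite --directory=/tank/fio-test --rw=write --bs=1M \
      --size=20G --ioengine=libaio --iodepth=16 --numjobs=1 --end_fsync=1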

Disclaimer: just guessing - you gave zero information about your devices!
 