Zeroing of data prior to 'migrate'

Tozz

We have a Proxmox VE cluster with 3 nodes, all using local storage. Local storage is a deliberate choice we made after running into issues with other storage solutions in the past.

When we migrate a machine from node A to node B we are seeing IO issues occurring on node B. Its SSD drives become saturated, causing other VMs to hang, run into timeouts, etc. During this high IO wait there is no network traffic from the migration, so it looks like the disks are being zeroed? We can see with 'lvs -a' that the newly created disk is being filled (the Data% percentage increases). Only after the data percentage reaches 100% do we see the actual migration start in the Proxmox WebUI:

Code:
2020-06-11 20:21:53 use dedicated network address for sending migration traffic (192.168.0.1)
2020-06-11 20:21:53 starting migration of VM 172 to node 'nlgrq1pm-p001' (192.168.0.1)
2020-06-11 20:21:54 found local disk 'thindata2:vm-172-disk-0' (in current VM config)
2020-06-11 20:21:54 copying local disk images
2020-06-11 20:21:54 starting VM 172 on remote node 'nlgrq1pm-p001'
2020-06-11 20:21:57 start remote tunnel
2020-06-11 20:21:58 ssh tunnel ver 1
2020-06-11 20:21:58 starting storage migration
2020-06-11 20:21:58 scsi0: start migration to nbd:unix:/run/qemu-server/172_nbd.migrate:exportname=drive-scsi0
drive mirror is starting for drive-scsi0 with bandwidth limit: 153600 KB/s

After the line 'drive mirror is starting for ...', the (I assume) zeroing begins. The Data% in the 'lvs -a' output for the LV increases until it reaches 100%. During this process 'iotop' shows about 800 MB/s of IO traffic, while network traffic is very low (e.g. kilobytes/s).
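
For anyone who wants to watch this happen, this is roughly what I use on the target node (just a sketch; I'm assuming the VG is also called 'thindata2' like the storage ID, adjust the names to your setup):

Code:
# Watch the thin LV allocation climb while the migration runs.
# 'thindata2' as VG name is an assumption, it may differ from the storage ID.
watch -n 2 'lvs -a -o lv_name,lv_size,data_percent thindata2'

# In a second shell, show only processes that are actually doing IO.
iotop -o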

After 'lvs -a' shows the LV is 100% filled, the process continues:

Code:
drive-scsi0: transferred: 50331648 bytes remaining: 64374177792 bytes total: 64424509440 bytes progression: 0.08 % busy: 0 ready: 0
drive-scsi0: transferred: 201326592 bytes remaining: 64223182848 bytes total: 64424509440 bytes progression: 0.31 % busy: 0 ready: 0

From here on it continues normally. Because the data now has to be transferred over the network, the IO load is lower and the machines that were experiencing high IO-wait times resume their work.

A couple of questions:
- The zeroing of devices doesn't seem to occur when moving disks (e.g. from storage0 to storage1 on the same node). Why the difference? The commands I'm comparing are sketched below.
- Can we prevent the (I assume) zeroing of devices? It causes timeouts in other VMs.
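
For clarity, these are the two operations I'm comparing (a sketch using the VM ID and node name from the log above; 'storage1' is just an example target storage):

Code:
# Moving a disk between two storages on the same node:
# no excessive IO load here, the thin LV stays thin.
qm move_disk 172 scsi0 storage1

# Live migration to another node including the local disk:
# this is where the target LV gets written full before the
# transfer counters show up in the task log.
qm migrate 172 nlgrq1pm-p001 --online --with-local-disks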
 
Is there nobody who can help me? I've tried disabling zeroing on the LVM thin pool with "lvchange -Zn vg/lv", but that doesn't resolve the issue.
Could it be the Proxmox tooling itself that zeroes the drives before the actual data transfer starts?
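
For reference, this is roughly what I tried on the target node (a sketch; 'thindata2' as VG name and 'data' as thin pool name are assumptions, the names will differ on other setups):

Code:
# Check whether zero-on-allocation is enabled for the thin pool
# (the 'zero' column should show this).
lvs -o lv_name,zero thindata2

# Disable zeroing of newly provisioned blocks on the thin pool.
lvchange -Zn thindata2/data

Even with -Zn set on the pool, the LV still gets filled completely before the actual transfer starts.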
 
