IO delays on live migration LV initialization

Helmo

Jan 11, 2018
Hi,

The migration itself goes just fine, but other VMs on the destination host are negatively affected. I'm seeing delays in storage response time, which in some instances leads to an unresponsive webserver. The VM is not completely locked up: I can ssh in and look around, but storage-intensive operations see more delay.

A significant detail is that this only happens while the new LV is being initialized.

The host uses LVM-thin as storage for the images. I first noticed this with a few larger disks, 200+ GB.

After the line 'drive mirror is starting for drive-scsi0' in the migration log, it stalls for a while, depending on the image size. During that time I see the target disk writing at full speed, more than the 1 Gbit network could deliver. So I assume it's zero-filling the new LV.
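As a rough sanity check of that assumption, the arithmetic can be sketched in Python. The 450 MB/s write rate below is a hypothetical figure for illustration, not a measurement from this host; only the 1 Gbit link and the 200 GB disk size come from the post.

```python
# Back-of-the-envelope check: could a 1 Gbit/s link explain the observed disk writes?

def link_limit_mb_per_s(gbit_per_s: float) -> float:
    """Upper bound on link throughput in MB/s (decimal units, no protocol overhead)."""
    return gbit_per_s * 1000 / 8


def fill_time_minutes(size_gb: float, write_mb_per_s: float) -> float:
    """Minutes needed to write size_gb of data at a sustained write rate."""
    return size_gb * 1000 / write_mb_per_s / 60


network_limit = link_limit_mb_per_s(1.0)   # 125.0 MB/s for a 1 Gbit/s link
observed_write = 450.0                     # hypothetical local write rate in MB/s

# Writes faster than the link can deliver cannot be migration payload alone.
print("link limit (MB/s):", network_limit)
print("writes exceed link:", observed_write > network_limit)
print("minutes to fill 200 GB:", round(fill_time_minutes(200, observed_write), 1))
```

Any sustained write rate above 125 MB/s on the target has to originate locally, which is consistent with the new LV being zero-filled before the mirror data arrives.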

At this stage I just see 'qm mtunnel' in the process list on the destination host.

Once the initialization part is done, it starts echoing lines like 'drive-scsi0: transferred X GiB of X GiB (X%) in Xm Xs', and that phase respects the limit in Datacenter > Options > Bandwidth Limits > Migrations. Then the problem is over.
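To put a number on the stall, the gap between the two log messages can be measured from a saved migration log. This is a minimal sketch: it assumes each log line starts with a `YYYY-MM-DD HH:MM:SS` timestamp, and the function name `init_stall_seconds` is made up for illustration.

```python
import re
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S"
TS = r"(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})"
# Assumed line shapes, based on the messages quoted in this thread.
START_RE = re.compile(TS + r" drive mirror is starting for (?P<drive>drive-\w+)")
PROGRESS_RE = re.compile(TS + r" (?P<drive>drive-\w+): transferred ")


def init_stall_seconds(log_lines):
    """Return {drive: seconds between mirror start and first progress line}."""
    started = {}
    stalls = {}
    for line in log_lines:
        m = START_RE.search(line)
        if m:
            started[m["drive"]] = datetime.strptime(m["ts"], FMT)
            continue
        m = PROGRESS_RE.search(line)
        # Only the first progress line per drive marks the end of the stall.
        if m and m["drive"] in started and m["drive"] not in stalls:
            delta = datetime.strptime(m["ts"], FMT) - started[m["drive"]]
            stalls[m["drive"]] = delta.total_seconds()
    return stalls


if __name__ == "__main__":
    sample = [
        "2023-08-01 10:00:00 drive mirror is starting for drive-scsi0",
        "2023-08-01 10:07:30 drive-scsi0: transferred 1.0 GiB of 200.0 GiB (0.50%) in 7m 31s",
    ]
    print(init_stall_seconds(sample))
```

Comparing that stall across disk sizes, or with discard on and off, would show whether it scales with the image size as described above.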

Could we place a similar throttle on the initialization step? Or can we skip it for LVM-thin?

The docs, in 7.12.3. Storage Features, mention "New volumes are automatically initialized with zero."
But doesn't LVM-thin do that anyway?

Running the latest 7.x as of this post:
proxmox-ve: 7.4-1 (running kernel: 5.15.53-1-pve)
pve-manager: 7.4-16 (running version: 7.4-16/0f39f621)
pve-kernel-5.15: 7.4-4
pve-kernel-5.4: 6.4-12
pve-kernel-5.15.108-1-pve: 5.15.108-2
pve-kernel-5.15.53-1-pve: 5.15.53-1
pve-kernel-5.4.162-1-pve: 5.4.162-2
pve-kernel-4.13.13-5-pve: 4.13.13-38
ceph-fuse: 14.2.21-1
corosync: 3.1.7-pve1
criu: 3.15-1+pve-1
glusterfs-client: 9.2-1
ifupdown: 0.8.36+pve2
libjs-extjs: 7.0.0-1
libknet1: 1.24-pve2
libproxmox-acme-perl: 1.4.4
libproxmox-backup-qemu0: 1.3.1-1
libproxmox-rs-perl: 0.2.1
libpve-access-control: 7.4.1
libpve-apiclient-perl: 3.2-1
libpve-common-perl: 7.4-2
libpve-guest-common-perl: 4.2-4
libpve-http-server-perl: 4.2-3
libpve-rs-perl: 0.7.7
libpve-storage-perl: 7.4-3
libspice-server1: 0.14.3-2.1
lvm2: 2.03.11-2.1
lxc-pve: 5.0.2-2
lxcfs: 5.0.3-pve1
novnc-pve: 1.4.0-1
proxmox-backup-client: 2.4.2-1
proxmox-backup-file-restore: 2.4.2-1
proxmox-kernel-helper: 7.4-1
proxmox-mail-forward: 0.1.1-1
proxmox-mini-journalreader: 1.3-1
proxmox-offline-mirror-helper: 0.5.2
proxmox-widget-toolkit: 3.7.3
pve-cluster: 7.3-3
pve-container: 4.4-6
pve-docs: 7.4-2
pve-edk2-firmware: 3.20230228-4~bpo11+1
pve-firewall: 4.3-4
pve-firmware: 3.6-5
pve-ha-manager: 3.6.1
pve-i18n: 2.12-1
pve-qemu-kvm: 7.2.0-8
pve-xtermjs: 4.16.0-2
qemu-server: 7.4-4
smartmontools: 7.2-pve3
spiceterm: 3.2-2
swtpm: 0.8.0~bpo11+3
vncterm: 1.7-1
zfsutils-linux: 2.1.11-pve1
 
Yes, the target storage is SSD/NVMe, so discard should work there.

Thanks for the links. I had been searching the forum for "zero", but not for "zeroing" :(

Unchecking the discard option on the VM disks does indeed skip this initialization step during a migration. But it feels wrong to disable that, and I definitely want space freed up inside the VM to be freed in the LVM-thin pool as well.
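To see up front which disks would hit the initialization step, the VM config can be scanned for `discard=on`. This is only a sketch: it assumes the usual `key: volume,option=value,...` layout of the files under /etc/pve/qemu-server/, and the helper name and sample config are hypothetical.

```python
import re

# Disk keys such as scsi0, virtio1, sata0, ide2 (not scsihw, boot, etc.).
DISK_KEY_RE = re.compile(r"^(scsi|virtio|sata|ide)\d+$")


def disks_with_discard(config_text):
    """Return the disk keys of the current config that have discard=on set."""
    result = []
    for raw in config_text.splitlines():
        line = raw.strip()
        if line.startswith("["):
            break  # snapshot sections follow; only the current config matters
        if not line or line.startswith("#") or ":" not in line:
            continue
        key, value = line.split(":", 1)
        key = key.strip()
        if not DISK_KEY_RE.match(key):
            continue
        options = [part.strip() for part in value.split(",")]
        if "discard=on" in options:
            result.append(key)
    return result


if __name__ == "__main__":
    # Hypothetical config contents for illustration.
    sample = """\
scsihw: virtio-scsi-pci
scsi0: local-lvm:vm-100-disk-0,discard=on,size=200G
scsi1: local-lvm:vm-100-disk-1,size=32G
ide2: none,media=cdrom
"""
    print(disks_with_discard(sample))
```

Large disks in that list are the ones where the stall on the destination host would be expected.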
 
Can someone point me to the code that does this initialization?

I see that the lvcreate is done in alloc_image() in src/PVE/Storage/LvmThinPlugin.pm, but I see no code filling the volume afterwards.

The only use of /dev/zero I can find is in the free_image() function, which is for removing an LV.
 
Hi,
no, that's a different issue where the VM would crash.

The zeroing happens because of what QEMU does when the discard option is enabled: https://forum.proxmox.com/threads/v...discard-results-in-high-i-o.97647/post-423059 I'm not sure whether we'd be breaking any guarantees w.r.t. discard by not doing so, but I haven't had time to look at the issue in detail yet.
 
