I started a thread on the Proxmox subreddit that quickly went down a rabbit hole before finally hitting the bottom: https://www.reddit.com/r/Proxmox/comments/1sf3l9y/high_io_pressure_stall_during_os_install_iscsi/
TL;DR
I am seeing high IO pressure stall, peaking at 30%, along with extremely long disk formatting times while installing any OS on a VM. The VM sits on LVM storage backed by iSCSI on a Pure X50 array, with qcow2 disks and snapshots as volume-chain enabled on the LVM storage. I initially thought this was a multipathing or network issue, but after a lot of testing it turns out that having 'Discard' enabled on the qcow2 disk, with snapshots as volume-chain enabled on the LVM storage, causes the slow disk formatting and the high IO pressure stall. Turning off 'Discard' on the disk gives normal format times and no IO pressure stall, but I'm pretty sure I want discard on so that storage can be reclaimed. The same behavior does not happen on NFS-backed storage (same array) with snapshots as volume-chain enabled, a qcow2 disk, and 'Discard' enabled. Turning off snapshots as volume-chain on the iSCSI-backed LVM storage forces raw disks, and with 'Discard' on there is no slow formatting or IO stall.
Is this known or expected behavior, or is it a bug/limitation of LVM with snapshots as volume-chain?
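For anyone who wants to compare numbers while reproducing this, here is a minimal sketch of how the IO pressure figure could be read directly on the host. It assumes the percentage in question corresponds to the kernel's pressure stall information (PSI) exposed at /proc/pressure/io, and the function names `parse_psi` and `read_io_pressure` are just illustrative helpers, not part of any Proxmox tooling.

```python
from pathlib import Path


def parse_psi(text: str) -> dict[str, dict[str, float]]:
    """Parse the contents of a PSI file such as /proc/pressure/io.

    Each line looks like:
        some avg10=0.00 avg60=0.00 avg300=0.00 total=0
    The first token is the stall category ("some" or "full"); the rest
    are key=value pairs. avg* values are percentages, total is in
    microseconds.
    """
    result: dict[str, dict[str, float]] = {}
    for line in text.strip().splitlines():
        kind, *fields = line.split()
        result[kind] = {
            key: float(value)
            for key, value in (field.split("=", 1) for field in fields)
        }
    return result


def read_io_pressure(path: str = "/proc/pressure/io") -> dict[str, dict[str, float]]:
    """Read and parse IO pressure from the host (assumes PSI is enabled)."""
    return parse_psi(Path(path).read_text())
```

Calling `read_io_pressure()["some"]["avg10"]` on the host during the disk format step, once with 'Discard' on and once with it off, would give a rough 10-second average that can be set against the stall shown in the GUI.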