LVM over shared iSCSI (MSA 2050) – Deleted VMs do not free up space – Any remediation?

ugo · New Member · Nov 14, 2024

Infrastructure Overview​


I’m running a production cluster based on Proxmox VE 8.x with the following setup:

  • Cluster: 4 × HPE DL380 Gen10 nodes
  • Shared Storage: HPE MSA 2050 configured in Virtual Storage mode
  • A single iSCSI LUN is presented to all nodes using multipath
  • Proxmox storage is configured as LVM (non-thin) on top of this iSCSI LUN, named storage-msa (see the storage.cfg sketch after this list)
  • All VM disks are stored on this shared volume
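
For reference, a storage entry for this kind of setup looks roughly like the sketch below (the volume group name vg_msa is a placeholder; the VG is created manually on the multipath device and only referenced here):

```
lvm: storage-msa
        vgname vg_msa
        content images
        shared 1
```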

⚠️ Issue​


When I delete a VM from the Proxmox GUI, the allocated space is not reclaimed on the MSA 2050. After investigation, I realized:

  • The LVM volume is not thin-provisioned, so discard/trim is ineffective
  • The iSCSI LUN does not receive any space reclamation commands
  • Over time, this causes the LUN to fill up permanently, even though VMs are deleted

This is starting to become a problem from a capacity planning and scalability perspective.
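
As a minimal illustration (hypothetical VG name and sizes): after a VM is deleted, LVM reports the extents as free again, but no UNMAP ever reaches the array, so the allocation shown on the MSA side never shrinks:

```
root@node1:~# vgs vg_msa
  VG     #PV #LV #SN Attr   VSize  VFree
  vg_msa   1  10   0 wz--n- <9.98t  3.20t
```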

What I’ve tried / considered​


  • Enabling discard=on for the VM disks: no effect (not relevant for thick LVM)
  • lvchange --discard passdown: not supported on non-thin LVs
  • fstrim: ineffective
  • Script to wipe orphaned volumes: works partially, but doesn’t free space on the LUN itself

❓My Question​


Is there any way to reclaim space on a shared iSCSI LUN used with LVM (non-thin), without having to destroy and recreate the volume group?

If not, what would be the best approach going forward to avoid this kind of trap?
  • Should I switch to LVM-thin (not shareable)?
  • Migrate to Ceph or ZFS over iSCSI?
  • Other recommendations?

Thanks a lot for your insights — really looking to align with best practices.
 
Hi there, I am facing exactly the same problem. If this is not a bug, then a solution is needed!

---

I tried everything: ZFS, plain LVM, and LVM-thin over the iSCSI LUN, including the discard option on LVM-thin. Nothing worked.

ZFS did not work.

LVM-thin with the discard option did not work either.

Only plain LVM with "Wipe Removed Volumes" enabled eventually reflects the freed space on the LUN side, after a very, very long wiping process.

Although

```
pvesm set <LVM> --saferemove_throughput 1048576000
```

can improve the wiping speed, it is still slow if the LVM storage is large!
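
For reference, the "Wipe Removed Volumes" checkbox corresponds to the saferemove storage option, so both settings can also be applied from the CLI in one go (same <LVM> placeholder as above):

```
pvesm set <LVM> --saferemove 1 --saferemove_throughput 1048576000
```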

Furthermore, if a VM has more than one disk on this LVM storage, removing the VM starts multiple wiping tasks against the same LUN at the same time, and that eventually crashes the LUN!

```
...
33533382656 B 31.2 GB 523.9 s (8:43 min) 64001764 B/s 61.04 MB/s
zero out finished (note: 'No space left on device' is ok here): write: No space left on device
Volume group "pve1vm22x-lp" not found   <------------ HERE
TASK ERROR: lvremove 'pve1vm22x-lp/del-vm-22000-cloudinit' error: Cannot process volume group pve1vm22x-lp
```

HERE: the LUN had already disappeared at this point.

I tested removing the VM disks one by one, and that works fine. VM -> Remove wipes all disks at the same time and crashes the LUN 100% of the time!
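
As a stop-gap, here is a sketch of that one-by-one approach from the CLI (hypothetical VMID and disk keys; this is not an official procedure, just what the observation above suggests):

```
# delete the disks one at a time and let each wipe task in the task log
# finish before starting the next one
qm disk unlink 22000 --idlist scsi1 --force
qm disk unlink 22000 --idlist scsi2 --force
# with at most one disk left, removing the VM triggers at most one wipe
qm destroy 22000
```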

---

If this is not a bug, then a solution is needed!

If there is a requirement that the backing LUN must support discard pass-down, I hope Proxmox documents it; having the usage disagree between the two sides is not acceptable.

Yet this should be simple and common practice, so why are so few people trying to debug it? Do people not care?
 
Just use file storage instead of block storage and all the problems described here are gone. The solution could be that easy, but nobody seems to care ...
:)
 
ZFS did not work.

LVM-thin with the discard option did not work either.

Correct, those are not supported options.

If this is not a bug, then a solution is needed!

It's not a bug. PVE doesn't have a mechanism to "talk" to your storage device. The only way to make sure this does not happen is to NOT thin-provision your LUNs on the storage.
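
That said, if the LUN on the array side is thin-provisioned and the array genuinely honors SCSI UNMAP (which is not a given for every array/firmware), there are two LVM-level knobs that are sometimes used to push discards down from thick LVM. A minimal sketch with hypothetical VG/LV names:

```
# 1) Let lvremove/lvreduce discard the freed extents themselves:
#    in /etc/lvm/lvm.conf, section "devices", set  issue_discards = 1
#
# 2) Or discard an LV's blocks explicitly before removing it:
blkdiscard /dev/vg_msa/vm-100-disk-0
lvremove -y vg_msa/vm-100-disk-0
```

Whether the freed space then actually shows up on the array is entirely up to the array, which is exactly the part PVE cannot control.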
 
Just use file storage instead of block storage and all the problems described here are gone. The solution could be that easy, but nobody seems to care ...
:)

Ummm, this is a little different from the reality I know...

Won't SMB or NFS bring latency and other problems in a real deployment? As far as I know, using a block device is the solid choice.
 
Correct, those are not supported options.

It's not a bug. PVE doesn't have a mechanism to "talk" to your storage device. The only way to make sure this does not happen is to NOT thin-provision your LUNs on the storage.

LOL, yes... if the LUN is not thin-provisioned, then I don't have to care about the size at all...
 
Ummm, this is a little different from the reality I know...

Won't SMB or NFS bring latency and other problems in a real deployment? As far as I know, using a block device is the solid choice.

Yeah, my latency is so high that I need a full second for what your wipe needed 523.9 s to do:
33533382656 B 31.2 GB 523.9 s (8:43 min)
That's just 524x as fast ... so much for the block-layer overhead; that claim is the one that's far from everyone else's reality.
:)
 
Yeah, my latency is so high that I need a full second for what your wipe needed 523.9 s to do:

That's just 524x as fast ... so much for the block-layer overhead; that claim is the one that's far from everyone else's reality.
:)

Where did you get "524x as fast" from? The wiping speed is throttled by PVE; is it capped at 60 MB/s on purpose? Did you even read my post?

I am saying that your approach of using NFS or SMB for VM disks is the one that is detached from reality, get it? But feel free to go for it.
 
```
[root@pvefs images]# l -h 187/vm-187-disk-0.raw
-rw-r--r-- 1 root root 32G Oct 25 17:52 187/vm-187-disk-0.raw
[root@pvefs images]# time rm -rf 187/vm-187-disk-0.raw

real 0m0.348s
user 0m0.000s
sys 0m0.345s
```

Oops, my estimate of a full 1 s for removing a 32G VM disk turned out to be too pessimistic, and that is on a 10-disk HDD RAID6.
 
And with a filesystem it makes no difference whether it sits on an HDD or SSD/NVMe RAID array, because you need no trim or discard at all to get the 32G back immediately:

```
root@pve1:/mnt/pve/pvefs_data/images# df .
Filesystem 1K-blocks Used Available Use% Mounted on
172.16.60.6:/srv/data 31247910912 4828304384 26419606528 16% /mnt/pve/pvefs_data

root@pve1:/mnt/pve/pvefs_data/images# rm -f 187/vm-187-disk-0.raw

root@pve1:/mnt/pve/pvefs_data/images# df .
Filesystem 1K-blocks Used Available Use% Mounted on
172.16.60.6:/srv/data 31247910912 4794749952 26453160960 16% /mnt/pve/pvefs_data
```
 
Ah yes, and let's not forget the "time-consuming" restore of the 32G image, so my colleagues don't miss it:

```
[root@pvefs 187]# time cp /srv/.xfssnaps/daily.2025-10-25_0001/srv/data/images/187/vm-187-disk-0.raw .

real 0m0.015s
user 0m0.000s
sys 0m0.002s

[root@pvefs 187]# ll
total 441736
-rw-r----- 1 root root 34359738368 Oct 25 18:37 vm-187-disk-0.raw
```
 
First, I have no idea where you get that sense of superiority from, kid.
---
I am talking about the zero-wipe behavior on a block device,
and you are talking about file-management behavior on NFS.
---
I am talking about real-world deployments,
and you are talking about a school lab.

I am talking about the real use case iSCSI was designed for, and the problem I found in managing it;
you are talking about how fancy your lab is.

Funny, isn't it?
---
If you manage everything at the file level, then of course everything looks instant.

If NFS worked for everything, why was iSCSI ever developed?

Run a heavily used database over NFS, go for it.

 
After around three decades in the IT industry, all I can say is: forgive those who don't know.
 
Hello, please take a look at the following bug report [1].

Therein it is explained that there is a bug in the MSA 2050 firmware: the device reports (via the LBPRZ bit) that reading a block again after discarding it will return zeroes, but in practice this does not happen. This can result in data corruption. I am not entirely sure whether you are affected by the same issue, but I would recommend checking the bug report nevertheless and applying the workaround described in comment 17 to prevent potential data corruption in the future. Additionally, please check whether there are firmware updates for the MSA 2050.

[1] https://bugzilla.proxmox.com/show_bug.cgi?id=5754
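
As a quick sanity check of what the LUN currently advertises to the host (device paths are examples, not taken from this thread):

```
lsblk -D /dev/mapper/mpatha     # non-zero DISC-GRAN/DISC-MAX means discards are plumbed through
sg_vpd --page=lbpv /dev/sdb     # sg3_utils: Logical Block Provisioning page, shows the LBPU/LBPRZ bits
```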
 
After around three decades in the IT industry, all I can say is: forgive those who don't know.

God, no one is here to argue with you about what you believe. My question, and the thread owner's, is about the space-reclamation issue over iSCSI. Nobody asked for recommendations based on whatever you happen to believe. Is age making you that stubborn?
 