Infrastructure Overview
I’m running a production cluster based on Proxmox VE 8.x with the following setup:
- Cluster: 4 × HPE DL380 Gen10 nodes
- Shared Storage: HPE MSA 2050 configured in Virtual Storage mode
- A single iSCSI LUN is presented to all nodes using multipath
- Proxmox storage is configured as thick LVM (non-thin) on top of this iSCSI LUN and is named storage-msa (rough storage.cfg sketch below)
- All VM disks are stored on this shared volume
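For context, the storage definition in /etc/pve/storage.cfg looks roughly like this (the VG name vg_msa is a placeholder here, not my exact config):

```
lvm: storage-msa
        vgname vg_msa
        content images
        shared 1
```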
Issue
When I delete a VM from the Proxmox GUI, the allocated space is not reclaimed on the MSA 2050. After investigation, I realized:
- The LVM volume group is thick-provisioned, so discard/TRIM from the guests is ineffective
- The iSCSI LUN never receives any UNMAP/space-reclamation commands
- Over time the LUN's allocated space on the MSA only grows, even though VMs are deleted and LVM itself reuses the freed extents
This is starting to become a problem from a capacity planning and scalability perspective.
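For illustration, this is roughly how the mismatch shows up from the hosts (vg_msa and mpatha are placeholder names):

```
# After deleting a VM, the extents are free again at the LVM level:
vgs -o vg_name,vg_size,vg_free vg_msa

# Check whether the multipath device advertises discard support at all:
lsblk -D /dev/mapper/mpatha

# But since the thick LVs never issue UNMAP, the MSA's pool usage
# (checked from the array UI/CLI) never goes back down.
```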
What I’ve tried / considered
- Enabling discard=on on the VM disks: no effect (discards are not passed through thick LVM)
- lvchange --discards passdown: only applies to thin pools, so not supported on these LVs
- fstrim inside the guests: ineffective for the same reason
- A script that wipes orphaned volumes (sketch below): works partially, but doesn't free space on the LUN itself
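Such a cleanup looks roughly like this sketch (vg_msa is a placeholder, and the orphan check is simplified to VM configs on the local node only):

```
#!/bin/bash
# Zero-fill and remove LVs that no longer belong to any VM config on this node.
VG=vg_msa

for lv in $(lvs --noheadings -o lv_name "$VG" | tr -d ' '); do
    # Proxmox names VM disks vm-<vmid>-disk-<n>; extract the vmid
    vmid=$(echo "$lv" | sed -n 's/^vm-\([0-9]\+\)-disk-.*$/\1/p')
    [ -z "$vmid" ] && continue

    # Skip LVs still referenced by a VM on this node
    [ -e "/etc/pve/qemu-server/${vmid}.conf" ] && continue

    echo "Wiping and removing orphaned LV $VG/$lv"
    lvchange -ay "$VG/$lv"                  # make sure the LV is active locally
    blkdiscard --zeroout "/dev/$VG/$lv"     # zero-fill; thick LVs cannot pass real discards
    lvremove -y "$VG/$lv"
done
```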
My Question
Is there any way to reclaim space on a shared iSCSI LUN used with LVM (non-thin), without having to destroy and recreate the volume group?
If not, what would be the best approach going forward to avoid this kind of trap?
- Should I switch to LVM-thin (which can't be used as shared storage across nodes)?
- Migrate to Ceph or ZFS over iSCSI?
- Other recommendations?
Thanks a lot for your insights — really looking to align with best practices.