proxmox 6.4 not correctly cleaning up disk devices on LVM after VM removal

Pca

New Member
Jul 1, 2021
I have Proxmox 6.4-13.
Our storage for VM disks is LVM on iSCSI over multipath.
Our VMs are created by full-cloning a VM template through the Proxmox API.

We then see random problems when deleting a VM through the Proxmox API:
the VM is removed and the LVs of its disks are removed, but the dm devices (dmsetup ls) are still there. So if we want to define a new VM with the same VMID, Proxmox complains while creating the VM because a device with that name already exists.

As a remediation, I wrote a script that compares the list of dm devices (dmsetup ls) with the list of LVs in the storage (lvs), then runs dmsetup remove for every device that has no matching LV.
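Roughly, the script does something like this (a minimal sketch of the approach described above; function names are mine, not the original script's):

```shell
#!/usr/bin/env bash
# Sketch: remove dm devices that have no backing LV. The comparison is
# done on the dm name (e.g. vg--live--disks--tb--1-vm--340--disk--0),
# which is what both `dmsetup ls` and `lvs -o lv_dm_path` can report.
set -euo pipefail

# All device-mapper device names known to the kernel.
dm_devices() { dmsetup ls | awk '{print $1}'; }

# dm names of LVs that still exist (basename of the dm path in /dev/mapper).
lv_dm_names() { lvs --noheadings -o lv_dm_path | awk -F/ '{print $NF}'; }

# Pure helper: print lines of $1 that do not occur in $2.
stale_devices() {
    comm -23 <(printf '%s\n' "$1" | sort -u) \
             <(printf '%s\n' "$2" | sort -u)
}

cleanup() {
    local dev
    for dev in $(stale_devices "$(dm_devices)" "$(lv_dm_names)"); do
        echo "removing stale dm device: $dev"
        dmsetup remove "$dev"
    done
}

# cleanup   # uncomment to actually run on a host (requires root)
```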

Has anyone seen this behaviour when cloning and removing lots of VMs on LVM storage?
 
Is there a way to get detailed logs of what the API does when deleting a VM and its components?
 
As an example:


The LVs are gone but the devices are still present:

Code:
root@pmx [~]: lvs |grep 340

root@pmx [~]: dmsetup ls | grep 340
vg--live--disks--tb--1-vm--340--disk--2 (253:220)
vg--live--disks--tb--1-vm--340--disk--1 (253:218)
vg--live--disks--tb--1-vm--340--disk--0 (253:217)
vg--live--disks--tb--1-vm--340--cloudinit       (253:219)

root@pmx [~]: dmsetup status | grep 340
vg--live--disks--tb--1-vm--340--disk--2: 0 16777216 linear
vg--live--disks--tb--1-vm--340--disk--1: 0 6291456 linear
vg--live--disks--tb--1-vm--340--disk--0: 0 1048576 linear
vg--live--disks--tb--1-vm--307--disk--0: 0 73400320 linear
vg--live--disks--tb--1-vm--340--cloudinit: 0 8192 linear

root@pmx [~]: lsblk -l | grep 340
vg--live--disks--tb--1-vm--340--disk--0   253:217  0   512M  0 lvm
vg--live--disks--tb--1-vm--340--disk--0   253:217  0   512M  0 lvm
vg--live--disks--tb--1-vm--340--disk--0   253:217  0   512M  0 lvm
vg--live--disks--tb--1-vm--340--disk--0   253:217  0   512M  0 lvm
vg--live--disks--tb--1-vm--340--disk--1   253:218  0     3G  0 lvm
vg--live--disks--tb--1-vm--340--disk--1   253:218  0     3G  0 lvm
vg--live--disks--tb--1-vm--340--disk--1   253:218  0     3G  0 lvm
vg--live--disks--tb--1-vm--340--disk--1   253:218  0     3G  0 lvm
vg--live--disks--tb--1-vm--340--cloudinit 253:219  0     4M  0 lvm
vg--live--disks--tb--1-vm--340--cloudinit 253:219  0     4M  0 lvm
vg--live--disks--tb--1-vm--340--cloudinit 253:219  0     4M  0 lvm
vg--live--disks--tb--1-vm--340--cloudinit 253:219  0     4M  0 lvm
vg--live--disks--tb--1-vm--340--disk--2   253:220  0     8G  0 lvm
vg--live--disks--tb--1-vm--340--disk--2   253:220  0     8G  0 lvm
vg--live--disks--tb--1-vm--340--disk--2   253:220  0     8G  0 lvm
vg--live--disks--tb--1-vm--340--disk--2   253:220  0     8G  0 lvm


root@pmx [~]: ls -l /dev/mapper/vg--live--disks--tb--1-vm--340--*
lrwxrwxrwx 1 root root 9 Jul 24 07:48 /dev/mapper/vg--live--disks--tb--1-vm--340--cloudinit -> ../dm-219
lrwxrwxrwx 1 root root 9 Jul 24 07:48 /dev/mapper/vg--live--disks--tb--1-vm--340--disk--0 -> ../dm-217
lrwxrwxrwx 1 root root 9 Jul 24 07:48 /dev/mapper/vg--live--disks--tb--1-vm--340--disk--1 -> ../dm-218
lrwxrwxrwx 1 root root 9 Jul 24 07:48 /dev/mapper/vg--live--disks--tb--1-vm--340--disk--2 -> ../dm-220

root@pmx [~]: ls -l /dev/dm-219
brw-rw---- 1 root disk 253, 219 Jul 24 07:48 /dev/dm-219
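The doubled dashes in these names are device-mapper's name escaping: every '-' inside the VG name or the LV name is doubled, then the two are joined with a single '-'. A tiny bash illustration (the helper name is mine, not from any tool):

```shell
#!/usr/bin/env bash
# Illustrative helper: reproduce device-mapper's LV name mangling.
# Each '-' inside the VG or LV name is doubled, then VG and LV are
# joined with a single '-'.
dm_name() {
    local vg=${1//-/--} lv=${2//-/--}
    printf '%s-%s\n' "$vg" "$lv"
}

dm_name vg-live-disks-tb-1 vm-340-disk-0
# → vg--live--disks--tb--1-vm--340--disk--0  (matches the dmsetup ls output)
```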
 
I am unable to reproduce this via the GUI, the CLI, or pvesh using this sequence of events:
- clone a VM from the template onto LVM storage
- destroy the VM
The problem is triggered only when using salt-cloud through a CI pipeline (the dm devices of the disks' LVs are not removed). Very strange; it looks like a timing problem or perhaps a bug in lvm2.
Looking at the LVMPlugin.pm code, I see that Proxmox uses lvremove -f instead of -y to remove the LV. Maybe that misses some case where it is not safe to remove the LV with the force option.
 