removing a template: error during cfs-locked 'storage-ceph' operation: rbd snap purge

rainer042

Hello,
Since today I have had a strange problem with my Proxmox installation. The storage backend is Ceph Nautilus. All VMs are created from templates as full clones. As of today, whenever I delete a VM template I get error messages like this:

Code:
Removing all snapshots: 0% complete...failed.
Could not remove disk 'ceph-pxa:base-174-disk-0', check manually: error during cfs-locked 'storage-ceph-pxa' operation: rbd snap purge 'base-174-disk-0' error: Removing all snapshots: 0% complete...failed.
Removing all snapshots: 0% complete...failed.
error during cfs-locked 'storage-ceph-pxa' operation: rbd snap purge 'base-174-disk-0' error: Removing all snapshots: 0% complete...failed.
TASK OK
Until today this always worked fine. Within PVE the template is deleted, but on Ceph the template is still there as an RBD image with a protected snapshot:
Code:
ceph$ rbd snap ls  pxa-rbd/base-174-disk-0
SNAPID NAME     SIZE   PROTECTED TIMESTAMP              
  2992 __base__ 32 GiB yes       Mon Jan 18 11:59:49 2021

I unprotected the snapshot, removed it, and then removed the base RBD image, which worked just fine. So on the Ceph side everything looks good, but on the PVE side I get this strange error.
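
For reference, the manual cleanup on the Ceph side corresponds roughly to the sketch below. It is a dry run that only prints the `rbd` commands. The pool, image and snapshot names are taken from the output above. The `rbd children` check is an extra step I am assuming is useful, not something from the original procedure.

```shell
#!/bin/sh
# Dry-run sketch: prints the rbd commands for removing a leftover template image.
# Names are taken from the 'rbd snap ls' output above; adjust for your own pool.
POOL="pxa-rbd"
IMAGE="base-174-disk-0"
SNAP="__base__"

cleanup_cmds() {
    # Assumption: check first that no clone still depends on the snapshot
    # (with full clones this list should be empty).
    echo "rbd children ${POOL}/${IMAGE}@${SNAP}"
    # Remove the protection, then the snapshot, then the image itself.
    echo "rbd snap unprotect ${POOL}/${IMAGE}@${SNAP}"
    echo "rbd snap rm ${POOL}/${IMAGE}@${SNAP}"
    echo "rbd rm ${POOL}/${IMAGE}"
}

cleanup_cmds
```

Running the printed commands one at a time, rather than piping them to a shell, leaves room to stop if `rbd children` unexpectedly lists a clone.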

This started today after I began creating a PVE snapshot of a VM with the "include memory" checkbox ticked, although I didn't want RAM to be included. So I stopped the snapshot via the web GUI and got this error:

Qemu Guest Agent is not running - VM 185 qmp command 'guest-ping' failed - got timeout
snapshot create failed: starting cleanup
TASK ERROR: VM 185 qmp command 'savevm-start' failed - VM snapshot already started

Perhaps this left a lock somewhere, but it may also have nothing to do with the problem.
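
One way to test the lock theory would be to look for a `lock:` entry in the VM configuration. The sketch below assumes that an interrupted snapshot leaves such a line in the output of `qm config 185`. The helper name `has_lock` is made up for illustration. This only covers a lock on the VM itself, not the cluster-wide storage lock named in the error message.

```shell
#!/bin/sh
# Sketch: report whether a VM config read from stdin contains a 'lock:' line.
# 'has_lock' is a hypothetical helper, not part of Proxmox.
has_lock() {
    grep -q '^lock:'
}

# Intended usage on a PVE node (not executed here):
#   qm config 185 | has_lock && echo "VM 185 is locked"
# If a stale lock is found, 'qm unlock 185' removes it.
```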


Does anyone have an idea what might be wrong? Could a reboot help?

Here is some more information:
Code:
root@pxsrv1: dpkg -l|grep pve
ii  pve-cluster                          6.2-1                        amd64        "pmxcfs" distributed cluster filesystem for Proxmox Virtual Environment.
ii  pve-container                        3.2-2                        all          Proxmox VE Container management tool
ii  pve-docs                             6.2-6                        all          Proxmox VE Documentation
ii  pve-edk2-firmware                    2.20200531-1                 all          edk2 based firmware modules for virtual machines
ii  pve-firewall                         4.1-3                        amd64        Proxmox VE Firewall
ii  pve-firmware                         3.1-3                        all          Binary firmware code for the pve-kernel
ii  pve-ha-manager                       3.1-1                        amd64        Proxmox VE HA Manager
ii  pve-i18n                             2.2-1                        all          Internationalization support for Proxmox VE
ii  pve-kernel-5.3                       6.1-6                        all          Latest Proxmox VE Kernel Image
ii  pve-kernel-5.3.10-1-pve              5.3.10-1                     amd64        The Proxmox PVE Kernel Image
rc  pve-kernel-5.3.18-1-pve              5.3.18-1                     amd64        The Proxmox PVE Kernel Image
ii  pve-kernel-5.3.18-3-pve              5.3.18-3                     amd64        The Proxmox PVE Kernel Image
ii  pve-kernel-5.4                       6.2-7                        all          Latest Proxmox VE Kernel Image
rc  pve-kernel-5.4.34-1-pve              5.4.34-2                     amd64        The Proxmox PVE Kernel Image
ii  pve-kernel-5.4.44-2-pve              5.4.44-2                     amd64        The Proxmox PVE Kernel Image
ii  pve-kernel-5.4.65-1-pve              5.4.65-1                     amd64        The Proxmox PVE Kernel Image
ii  pve-kernel-helper                    6.2-7                        all          Function for various kernel maintenance tasks.
ii  pve-lxc-syscalld                     0.9.1-1                      amd64        PVE LXC syscall daemon
ii  pve-manager                          6.2-12                       amd64        Proxmox Virtual Environment Management Tools
ii  pve-qemu-kvm                         5.1.0-3                      amd64        Full virtualization on x86 hardware
ii  pve-xtermjs                          4.7.0-2                      amd64        Binaries built from the Rust termproxy crate
ii  smartmontools                        7.1-pve2                     amd64        control and monitor storage systems using S.M.A.R.T.


Code:
root@pxsrv1:/etc/pve#  pvecm status
Cluster information
-------------------
Name:             pxa
Config Version:   5
Transport:        knet
Secure auth:      on

Quorum information
------------------
Date:             Mon Jan 18 14:02:08 2021
Quorum provider:  corosync_votequorum
Nodes:            5
Node ID:          0x00000001
Ring ID:          1.327
Quorate:          Yes

Votequorum information
----------------------
Expected votes:   5
Highest expected: 5
Total votes:      5
Quorum:           3
Flags:            Quorate

Membership information
----------------------
    Nodeid      Votes Name
0x00000001          1 a.b.c.d (local)
0x00000002          1 a.b.c.d
0x00000003          1 a.b.c.d
0x00000004          1 a.b.c.d
0x00000005          1 a.b.c.d
root@pxa1:/etc/pve#
root@pxa1:/etc/pve#  pvecm status
Cluster information
-------------------
Name:             pxa
Config Version:   5
Transport:        knet
Secure auth:      on

Quorum information
------------------
Date:             Mon Jan 18 14:06:08 2021
Quorum provider:  corosync_votequorum
Nodes:            5
Node ID:          0x00000001
Ring ID:          1.327
Quorate:          Yes

Votequorum information
----------------------
Expected votes:   5
Highest expected: 5
Total votes:      5
Quorum:           3
Flags:            Quorate

Membership information
----------------------
    Nodeid      Votes Name
0x00000001          1 a.b.c.d
0x00000002          1 a.b.c.d
0x00000003          1 a.b.c.d
0x00000004          1 a.b.c.d
0x00000005          1 a.b.c.d
 