Hello,
Since today I have a strange problem with my Proxmox installation. The storage backend is Ceph Nautilus, and all VMs are created from templates as full clones. It seems that whenever I delete a VM template, I now get this kind of error message:
Code:
Removing all snapshots: 0% complete...failed.
Could not remove disk 'ceph-pxa:base-174-disk-0', check manually: error during cfs-locked 'storage-ceph-pxa' operation: rbd snap purge 'base-174-disk-0' error: Removing all snapshots: 0% complete...failed.
Removing all snapshots: 0% complete...failed.
error during cfs-locked 'storage-ceph-pxa' operation: rbd snap purge 'base-174-disk-0' error: Removing all snapshots: 0% complete...failed.
TASK OK
Until today this always worked fine. Inside PVE the template is deleted; on the Ceph side, however, the template is still there as an RBD image with a protected snapshot:

Code:
ceph$ rbd snap ls pxa-rbd/base-174-disk-0
SNAPID NAME SIZE PROTECTED TIMESTAMP
2992 __base__ 32 GiB yes Mon Jan 18 11:59:49 2021
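Since all my VMs are full clones, there should be no clones left hanging off the `__base__` snapshot, and nothing should still have the image open. For anyone who wants to verify that on their side, these are the checks I would use (pool/image names as above; just an illustration, not output from my task log):

Code:
# list any clones still depending on the protected base snapshot
rbd children pxa-rbd/base-174-disk-0@__base__
# show open watchers on the image, in case something still has it mapped
rbd status pxa-rbd/base-174-disk-0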
I unprotected the snapshot, removed it, and then removed the base RBD image, which worked just fine. So on the Ceph side everything looks good now, but on the PVE side I still get the strange error.
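For reference, the manual cleanup boils down to these three rbd commands (written from memory, so treat it as a sketch rather than a transcript):

Code:
# remove the protection flag so the snapshot can be deleted
rbd snap unprotect pxa-rbd/base-174-disk-0@__base__
# delete the snapshot itself
rbd snap rm pxa-rbd/base-174-disk-0@__base__
# finally remove the now-unreferenced base image
rbd rm pxa-rbd/base-174-disk-0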
This started today after I began creating a PVE snapshot of a VM with the "include memory" checkbox enabled, although I didn't actually want RAM to be included. So I aborted the snapshot task via the web GUI and got this error:

Code:
Qemu Guest Agent is not running - VM 185 qmp command 'guest-ping' failed - got timeout
snapshot create failed: starting cleanup
TASK ERROR: VM 185 qmp command 'savevm-start' failed - VM snapshot already started
Perhaps this left a lock somewhere, but it may also have nothing to do with the problem.
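If it is a stale lock, I would expect it to show up in the VM config. This is what I would check next (untested on my side so far, just the commands I'd try):

Code:
# a stale lock would appear as a 'lock:' line in the config
qm config 185 | grep lock
# list snapshots, including any half-finished one
qm listsnapshot 185
# force-remove a leftover lock
qm unlock 185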
Does anyone have an idea what might be wrong? Could a reboot help?
Here is some more info:
Code:
root@pxsrv1: dpkg -l|grep pve
ii pve-cluster 6.2-1 amd64 "pmxcfs" distributed cluster filesystem for Proxmox Virtual Environment.
ii pve-container 3.2-2 all Proxmox VE Container management tool
ii pve-docs 6.2-6 all Proxmox VE Documentation
ii pve-edk2-firmware 2.20200531-1 all edk2 based firmware modules for virtual machines
ii pve-firewall 4.1-3 amd64 Proxmox VE Firewall
ii pve-firmware 3.1-3 all Binary firmware code for the pve-kernel
ii pve-ha-manager 3.1-1 amd64 Proxmox VE HA Manager
ii pve-i18n 2.2-1 all Internationalization support for Proxmox VE
ii pve-kernel-5.3 6.1-6 all Latest Proxmox VE Kernel Image
ii pve-kernel-5.3.10-1-pve 5.3.10-1 amd64 The Proxmox PVE Kernel Image
rc pve-kernel-5.3.18-1-pve 5.3.18-1 amd64 The Proxmox PVE Kernel Image
ii pve-kernel-5.3.18-3-pve 5.3.18-3 amd64 The Proxmox PVE Kernel Image
ii pve-kernel-5.4 6.2-7 all Latest Proxmox VE Kernel Image
rc pve-kernel-5.4.34-1-pve 5.4.34-2 amd64 The Proxmox PVE Kernel Image
ii pve-kernel-5.4.44-2-pve 5.4.44-2 amd64 The Proxmox PVE Kernel Image
ii pve-kernel-5.4.65-1-pve 5.4.65-1 amd64 The Proxmox PVE Kernel Image
ii pve-kernel-helper 6.2-7 all Function for various kernel maintenance tasks.
ii pve-lxc-syscalld 0.9.1-1 amd64 PVE LXC syscall daemon
ii pve-manager 6.2-12 amd64 Proxmox Virtual Environment Management Tools
ii pve-qemu-kvm 5.1.0-3 amd64 Full virtualization on x86 hardware
ii pve-xtermjs 4.7.0-2 amd64 Binaries built from the Rust termproxy crate
ii smartmontools 7.1-pve2 amd64 control and monitor storage systems using S.M.A.R.T.
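If the dpkg list is hard to read, I can also post the output of:

Code:
pveversion -v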
Code:
root@pxsrv1:/etc/pve# pvecm status
Cluster information
-------------------
Name: pxa
Config Version: 5
Transport: knet
Secure auth: on
Quorum information
------------------
Date: Mon Jan 18 14:02:08 2021
Quorum provider: corosync_votequorum
Nodes: 5
Node ID: 0x00000001
Ring ID: 1.327
Quorate: Yes
Votequorum information
----------------------
Expected votes: 5
Highest expected: 5
Total votes: 5
Quorum: 3
Flags: Quorate
Membership information
----------------------
Nodeid Votes Name
0x00000001 1 a.b.c.d (local)
0x00000002 1 a.b.c.d
0x00000003 1 a.b.c.d
0x00000004 1 a.b.c.d
0x00000005 1 a.b.c.d