ceph locked rbd after proxmox node crash

Rico29

Hello,
I'm using a 3-node Proxmox cluster (5.3.9), connected to a remote Ceph cluster via a dedicated 10G network.

Everything works fine and it's very reliable, but a problem occurs when a Proxmox node crashes.

Proxmox's HA moves the VMs from the crashed node to other nodes and starts them. The VMs are seen as started by Proxmox, but the rbd images are still locked on the Ceph side by the crashed node. So each VM stays in a state where its disk is locked by another process, and fails to boot correctly.

Example:
proxmox1: 192.168.171.61/24 (172.18.7.61 on ceph side)
proxmox2: 192.168.171.62/24 (172.18.7.62 on ceph side)
proxmox3: 192.168.171.63/24 (172.18.7.63 on ceph side)
ceph1: 172.18.7.51/24
ceph2: 172.18.7.52/24
ceph3: 172.18.7.53/24

VM 201 is running on proxmox1. On the Ceph side, I can see that the rbd is locked by proxmox1's address:
Code:
root@ceph-am7-1:~# rbd lock ls --pool c7000-pxmx1-am7 vm-201-disk-0
There is 1 exclusive lock on this image.
Locker       ID                   Address              
client.70464 auto 140450841490944 172.18.7.61:0/2839087142
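
If you want to cross-check whether the client holding that lock still has a live watch on the image, something like this should work (just a sketch, using the same pool and image as above):
Code:
# "rbd status" lists the current watchers of the image. A crashed client's
# watch eventually times out and disappears, while its lock entry remains.
rbd status --pool c7000-pxmx1-am7 vm-201-disk-0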



proxmox1 crashes. Proxmox's HA moves the VM to another node (proxmox3 in this case).
On the Ceph side:
Code:
root@ceph-am7-1:~# rbd lock ls --pool c7000-pxmx1-am7 vm-201-disk-0
There is 1 exclusive lock on this image.
Locker       ID                   Address              
client.70464 auto 140450841490944 172.18.7.61:0/2839087142
The lock from the dead proxmox node is still present.

On the VM console, I see:

[screenshot: vm201.png]

The only way to make the VM work again is to unlock it on the Ceph side:

Code:
root@ceph-am7-1:~# rbd lock remove --pool c7000-pxmx1-am7 vm-201-disk-0 "auto 140450841490944" client.70464

As soon as I unlock the rbd, fsck works and the VM starts.
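
If several VMs were running on the crashed node, the same unlock has to be repeated for every image. A rough sweep over the pool could look like this (hypothetical helper, adjust the dead node's address, and double-check before removing anything):
Code:
# Rough sketch: list every image in the pool and print any lock whose
# address matches the dead node (172.18.7.61 here). The actual
# "rbd lock remove" call is left commented out so nothing is removed
# by accident.
POOL=c7000-pxmx1-am7
DEAD_ADDR=172.18.7.61
for IMG in $(rbd ls --pool "$POOL"); do
    rbd lock ls --pool "$POOL" "$IMG" | grep "$DEAD_ADDR" && echo "stale lock on $IMG"
    # rbd lock remove --pool "$POOL" "$IMG" "<lock id>" "<locker>"
done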

What would be the way to resolve this issue? As I still have quorum, could fencing send commands to the Ceph cluster to unlock the RBDs?
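
Something along these lines, run against the Ceph cluster, is what I have in mind; just a sketch of the idea, not a tested procedure (the address and lock values come from the "rbd lock ls" output above):
Code:
# Fence the dead client on the Ceph side: blacklist its address so it can
# no longer touch the cluster, then drop its stale lock.
ceph osd blacklist add 172.18.7.61:0/2839087142
rbd lock remove --pool c7000-pxmx1-am7 vm-201-disk-0 "auto 140450841490944" client.70464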

Regards,
Cédric
 

Could you please post the ceph.conf of the Proxmox nodes? How is the storage set up?
 

Hi Alwin,
the Ceph storage is not on the Proxmox nodes, so there is no ceph.conf on them.

The storage is defined in /etc/pve/storage.cfg, which looks like this:
Code:
rbd: ceph_pxmx1
        content images
        krbd 0
        monhost 172.18.7.51 172.18.7.52 172.18.7.53
        pool c7000-pxmx1-am7
        username pxmx1

dir: local
        path /var/lib/vz
        content iso,snippets
        maxfiles 0
        shared 0
I have the corresponding keyring in /etc/pve/priv/ceph/ceph_pxmx1.keyring.
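
For reference, the same credentials can also be used directly from a Proxmox node to look at the locks, something like this (untested from the Proxmox side, but these should be the usual librados options):
Code:
# Query the locks from a Proxmox node, using the PVE-managed keyring and
# one of the monitors from storage.cfg; no local ceph.conf is needed when
# the monitor, id and keyring are given explicitly.
rbd -m 172.18.7.51 --id pxmx1 \
    --keyring /etc/pve/priv/ceph/ceph_pxmx1.keyring \
    lock ls --pool c7000-pxmx1-am7 vm-201-disk-0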

Regards,
Cédric
 
If you want to take a look at the ceph.conf file of the Ceph nodes:
Code:
root@ceph-am7-1:~# cat /etc/ceph/ceph.conf 
[global]
fsid = fe4cccf5-89cb-4922-88a3-7525bf676581
mon_initial_members = ceph-am7-1, ceph-am7-2, ceph-am7-3
mon_host = 172.18.7.51,172.18.7.52,172.18.7.53
auth_cluster_required = cephx
auth_service_required = cephx
auth_client_required = cephx
public network=172.18.7.0/24

osd pool default size = 3
osd pool default min size = 2
 
What version is the ceph cluster? Is the image creation done through the PVE nodes? What are the image features of an image that holds the lock?
 
What version is the ceph cluster?
ceph version 12.2.11 (26dc3775efc7bb286a1d6d66faee0ba30ea23eee) luminous (stable)

Is the image creation done through the PVE nodes?
Yes

What are the image features of an image that holds the lock?
Code:
root@ceph-am7-1:~# rbd info --pool c7000-pxmx1-am7 vm-201-disk-0
rbd image 'vm-201-disk-0':
        size 15GiB in 3840 objects
        order 22 (4MiB objects)
        block_name_prefix: rbd_data.1ec76b8b4567
        format: 2
        features: layering, exclusive-lock, object-map, fast-diff, deep-flatten
        flags: 
        create_timestamp: Wed Feb 20 11:42:29 2019
 
Does the user 'pxmx1' have the right caps? Does a live migration work?
 
Everything is working fine except this lock problem when a node crashes, so yes, live migration works perfectly.
The Ceph permissions are:

Code:
client.pxmx1
        key: AQDPEGxcA+fJGxAAljqFfiQMrFthiFqic0JWEw==
        caps: [mon] allow r
        caps: [osd] allow class-read object_prefix rbd_children, allow rwx pool=c7000-pxmx1-am7
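
For comparison, the Luminous documentation describes "profile rbd" caps for RBD clients; the mon part of that profile also includes the blacklist permission that librbd needs in order to blacklist a dead lock holder and take over its exclusive lock. A sketch of what adjusting the caps could look like (same user and pool as above, not verified against this setup):
Code:
# Sketch: switch the client to the standard Luminous rbd profiles.
# "profile rbd" on the mon includes the "osd blacklist" permission used
# by librbd to break the exclusive lock of a crashed client.
ceph auth caps client.pxmx1 \
    mon 'profile rbd' \
    osd 'profile rbd pool=c7000-pxmx1-am7'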
 
