mon.pxceph crashed on host pxceph, need help please.

nttec

We upgraded our Proxmox from 5.4 to 6.1.5, and Ceph from Luminous to Nautilus. Some of the issues we encountered have been solved, but there is one we can't solve.

We get a HEALTH_WARN that says:

mon.pxceph crashed on host pxceph at 2019-12-20 20:48:37.776154Z

The timestamp in the warning stays the same even after we restart the monitor.
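
A timestamp that never changes would be consistent with the warning pointing at a stored crash report rather than at a new crash on each restart. A minimal sketch for checking this, assuming the Nautilus crash module is what raises the warning (not confirmed in this post). The function names are hypothetical; the `ceph` subcommands are the crash module's own:

```shell
# Sketch only: assumes the ceph CLI is installed and the crash module is enabled.
# inspect_crashes and acknowledge_crashes are hypothetical helper names.

inspect_crashes() {
    # List the crash reports the cluster has stored.
    ceph crash ls
    # Show the full text of the current health warnings.
    ceph health detail
}

acknowledge_crashes() {
    # Archive all stored crash reports so they no longer count as recent.
    # Review them first with: ceph crash info <crash-id>
    ceph crash archive-all
}
```

Running `inspect_crashes` first shows whether a report with the 2019-12-20 timestamp exists. Archiving only acknowledges the report; it does not explain why the monitor crashed.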

Also, when we restart the monitor, these errors show up in our logs:


Dec 20 18:38:20 pxceph ceph-mon[142699]: 2019-12-20 18:38:20.555 7fc093ff4280 -1 WARNING: 'mon addr' config option v1:10.10.20.20:6789/0 does not match monmap file
Dec 20 18:38:20 pxceph ceph-mon[142699]: continuing with monmap configuration
Dec 20 18:38:21 pxceph ceph-mon[142699]: 2019-12-20 18:38:21.179 7fc08baf4700 -1 mon.pxceph@0(electing) e6 failed to get devid for : fallback method has serial ''but no model
Dec 20 18:38:21 pxceph ceph-mon[142699]: 2019-12-20 18:38:21.199 7fc08baf4700 -1 mon.pxceph@0(electing) e6 failed to get devid for : fallback method has serial ''but no model


We are following this guide: https://pve.proxmox.com/wiki/Ceph_Luminous_to_Nautilus

We also applied this part of the guide:

Enable msgrv2 protocol and update Ceph configuration
To enable the new v2 network protocol, issue the following command:

ceph mon enable-msgr2

This will instruct all monitors that bind to the old default port 6789 for the legacy v1 protocol to also bind to the new 3300 v2 protocol port. To see if all monitors have been updated run

ceph mon dump

and verify that each monitor has both a v2: and v1: address listed.
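
The verification step above can be sketched as a small filter. This is not part of the guide: `check_mon_addrs` is a hypothetical helper, and it assumes monitor lines in the dump look like `0: [v2:10.0.0.100:3300/0,v1:10.0.0.100:6789/0] mon.a`.

```shell
# check_mon_addrs reads `ceph mon dump` output on stdin and prints the name
# of every monitor whose line lacks a v2: or a v1: address.
# Hypothetical helper; the line format it expects is an assumption.
check_mon_addrs() {
    awk '/^[0-9]+: / {
        # $NF is the last field on the line, i.e. the monitor name.
        if ($0 !~ /v2:/ || $0 !~ /v1:/) print $NF
    }'
}
```

Usage would be `ceph mon dump | check_mon_addrs`. Empty output means every monitor line carries both address types.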

Updating /etc/pve/ceph.conf

For each host that has been upgraded, you should update your /etc/pve/ceph.conf file so that it either specifies no monitor port (if you are running the monitors on the default ports) or references both the v2 and v1 addresses and ports explicitly. Things will still work if only the v1 IP and port are listed, but each CLI instantiation or daemon will need to reconnect after learning the monitors also speak the v2 protocol, slowing things down a bit and preventing a full transition to the v2 protocol.

It is recommended to add all monitor ips (without port) to 'mon_host' in the global section like this:

[global]
...
mon_host = 10.0.0.100 10.0.0.101 10.0.0.102


But we end up with the same result. Can anyone please tell us what we should do?

cat /etc/ceph/ceph.conf
[global]
auth client required = cephx
auth cluster required = cephx
auth service required = cephx
cluster network = 10.10.10.0/24
debug ms = 0/0
fsid = 672d9ca3-b4b4-4313-9ecb-1dd02e8da71d
mon allow pool delete = true
mon_host = 10.10.20.20 10.10.20.21 10.10.20.22
osd deep scrub interval = 1209600
osd scrub begin hour = 19
osd scrub end hour = 6
osd scrub sleep = 0.1
osd journal size = 5120
osd pool default min size = 2
osd pool default size = 3
public network = 10.10.20.0/24
bluestore_block_db_size = 40000000000

[client]
keyring = /etc/pve/priv/$cluster.$name.keyring

[osd]
keyring = /var/lib/ceph/osd/ceph-$id/keyring

[mon.pxceph]
host = pxceph
mon addr = 10.10.20.20:6789

[mon.pxceph2]
host = pxceph2
mon addr = 10.10.20.21:6789

[mon.pxceph3]
host = pxceph3
mon addr = 10.10.20.22:6789
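
The quoted guide says ceph.conf should either specify no monitor port or list both the v2 and v1 addresses, while the [mon.*] sections above still pin port 6789. A minimal sketch for finding such entries follows. `list_pinned_mon_ports` is a hypothetical helper, and whether these entries are related to the health warning is not established here.

```shell
# list_pinned_mon_ports prints every "mon addr" line in the given file whose
# value ends in an explicit :port. Hypothetical helper name.
list_pinned_mon_ports() {
    # Accepts both "mon addr" and "mon_addr" spellings, with optional indentation.
    grep -E '^[[:space:]]*mon[ _]addr[[:space:]]*=.*:[0-9]+[[:space:]]*$' "$1"
}
```

For example, `list_pinned_mon_ports /etc/pve/ceph.conf` would print the three `mon addr` lines from the file shown above.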

In the GUI, Ceph > Monitor shows that all monitors are running.
 