We upgraded our Proxmox from 5.4 to 6.1.5. Most of the issues we encountered have been solved, but there is one we can't solve.
We also upgraded Ceph from Luminous to Nautilus.
We now get a HEALTH_WARN that says:
mon.pxceph crashed on host pxceph at 2019-12-20 20:48:37.776154Z
The timestamp in the warning stays the same even after we restart the monitor.
Also, when we restart the monitor, our logs show this error:
Dec 20 18:38:20 pxceph ceph-mon[142699]: 2019-12-20 18:38:20.555 7fc093ff4280 -1 WARNING: 'mon addr' config option v1:10.10.20.20:6789/0 does not match monmap file
Dec 20 18:38:20 pxceph ceph-mon[142699]: continuing with monmap configuration
Dec 20 18:38:21 pxceph ceph-mon[142699]: 2019-12-20 18:38:21.179 7fc08baf4700 -1 mon.pxceph@0(electing) e6 failed to get devid for : fallback method has serial ''but no model
Dec 20 18:38:21 pxceph ceph-mon[142699]: 2019-12-20 18:38:21.199 7fc08baf4700 -1 mon.pxceph@0(electing) e6 failed to get devid for : fallback method has serial ''but no model
We are following this guide: https://pve.proxmox.com/wiki/Ceph_Luminous_to_Nautilus
We also applied this part of the guide:
Enable msgrv2 protocol and update Ceph configuration
To enable the new v2 network protocol, issue the following command:
ceph mon enable-msgr2
This will instruct all monitors that bind to the old default port 6789 for the legacy v1 protocol to also bind to the new 3300 v2 protocol port. To see if all monitors have been updated run
ceph mon dump
and verify that each monitor has both a v2: and v1: address listed.
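For what it's worth, this check can also be scripted. Below is a minimal Python sketch, not taken from the guide, that reads the JSON form of the dump (`ceph mon dump --format json`) and reports monitors that lack a v1 or v2 address. It assumes the JSON has a `mons` list whose entries carry `name` and `public_addrs.addrvec` with a `type` field per address; the function name and the sample data are made up for illustration and are not output from our cluster.

```python
import json


def monitors_missing_protocols(mon_dump_json):
    """Return {monitor name: [missing protocol types]} for a mon dump in JSON form.

    Assumes each monitor entry has public_addrs.addrvec, a list of
    {"type": "v1" | "v2", "addr": "ip:port", ...} objects.
    """
    dump = json.loads(mon_dump_json)
    missing = {}
    for mon in dump.get("mons", []):
        addrvec = mon.get("public_addrs", {}).get("addrvec", [])
        seen = {entry.get("type") for entry in addrvec}
        lacking = sorted({"v1", "v2"} - seen)
        if lacking:
            missing[mon.get("name", "?")] = lacking
    return missing


# Illustrative sample only (hypothetical monitors "a" and "b").
SAMPLE = json.dumps({
    "mons": [
        {"name": "a", "public_addrs": {"addrvec": [
            {"type": "v2", "addr": "10.0.0.100:3300", "nonce": 0},
            {"type": "v1", "addr": "10.0.0.100:6789", "nonce": 0},
        ]}},
        {"name": "b", "public_addrs": {"addrvec": [
            {"type": "v1", "addr": "10.0.0.101:6789", "nonce": 0},
        ]}},
    ]
})

if __name__ == "__main__":
    print(monitors_missing_protocols(SAMPLE))
```

On a real node the JSON string would come from running the command above; an empty result would mean every monitor lists both address types.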
Updating /etc/pve/ceph.conf
For each host that has been upgraded, you should update your /etc/pve/ceph.conf file so that it either specifies no monitor port (if you are running the monitors on the default ports) or references both the v2 and v1 addresses and ports explicitly. Things will still work if only the v1 IP and port are listed, but each CLI instantiation or daemon will need to reconnect after learning the monitors also speak the v2 protocol, slowing things down a bit and preventing a full transition to the v2 protocol.
It is recommended to add all monitor ips (without port) to 'mon_host' in the global section like this:
[global]
...
mon_host = 10.0.0.100 10.0.0.101 10.0.0.102
But we end up with the same result. Can anyone please enlighten us on what we should do?
cat /etc/ceph/ceph.conf
[global]
auth client required = cephx
auth cluster required = cephx
auth service required = cephx
cluster network = 10.10.10.0/24
debug ms = 0/0
fsid = 672d9ca3-b4b4-4313-9ecb-1dd02e8da71d
mon allow pool delete = true
mon_host = 10.10.20.20 10.10.20.21 10.10.20.22
osd deep scrub interval = 1209600
osd scrub begin hour = 19
osd scrub end hour = 6
osd scrub sleep = 0.1
osd journal size = 5120
osd pool default min size = 2
osd pool default size = 3
public network = 10.10.20.0/24
bluestore_block_db_size = 40000000000
[client]
keyring = /etc/pve/priv/$cluster.$name.keyring
[osd]
keyring = /var/lib/ceph/osd/ceph-$id/keyring
[mon.pxceph]
host = pxceph
mon addr = 10.10.20.20:6789
[mon.pxceph2]
host = pxceph2
mon addr = 10.10.20.21:6789
[mon.pxceph3]
host = pxceph3
mon addr = 10.10.20.22:6789
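Since the guide says to list monitors either without a port or with both v1 and v2 addresses, and the log above complains that 'mon addr' does not match the monmap, it may be worth checking which `[mon.*]` sections still pin an explicit port. Here is a minimal Python sketch of that check. It assumes IPv4 addresses (any colon in the value is treated as a port separator) and treats `mon addr` and `mon_addr` as the same key; the function name is hypothetical, and we have not confirmed that removing these entries clears the warning.

```python
import configparser


def mon_addrs_with_port(conf_text):
    """Return {section: value} for [mon.*] sections whose mon addr includes a port."""
    parser = configparser.ConfigParser(
        interpolation=None,   # keep "$cluster" and similar values literal
        delimiters=("=",),    # do not split keys on ":" inside addresses
        strict=False,         # tolerate repeated keys or sections
    )
    parser.read_string(conf_text)
    flagged = {}
    for section in parser.sections():
        if not section.startswith("mon."):
            continue
        for key, value in parser.items(section):
            # Normalise "mon addr" to "mon_addr" before comparing.
            if key.replace(" ", "_") == "mon_addr" and ":" in value:
                flagged[section] = value
    return flagged


# Excerpt of the configuration posted above.
SAMPLE = """\
[global]
mon_host = 10.10.20.20 10.10.20.21 10.10.20.22

[mon.pxceph]
host = pxceph
mon addr = 10.10.20.20:6789
"""

if __name__ == "__main__":
    print(mon_addrs_with_port(SAMPLE))
```

Run against the full file, this would list all three of our monitor sections, since each one sets `mon addr` with port 6789.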
And in the GUI, under Ceph > Monitor, it shows that all monitors are running.