After update of Nov 14 - monitors fail to start

I ran the updates, which installed a new kernel. After the reboot the monitor did not start. I attempted to start it from the command line:

systemctl status ceph-mon@proxp01.service
ceph-mon@proxp01.service - Ceph cluster monitor daemon
Loaded: loaded (/lib/systemd/system/ceph-mon@.service; enabled; vendor preset: enabled)
Drop-In: /lib/systemd/system/ceph-mon@.service.d
└─ceph-after-pve-cluster.conf
Active: failed (Result: exit-code) since Fri 2019-11-15 11:57:55 EST; 47s ago
Process: 11469 ExecStart=/usr/bin/ceph-mon -f --cluster ${CLUSTER} --id proxp01 --setuser ceph --setgroup ceph (code=exited, status=1/FAILURE)
Main PID: 11469 (code=exited, status=1/FAILURE)

Nov 15 11:57:55 putsproxp01 systemd[1]: ceph-mon@proxp01.service: Service RestartSec=10s expired, scheduling restart.
Nov 15 11:57:55 putsproxp01 systemd[1]: ceph-mon@proxp01.service: Scheduled restart job, restart counter is at 5.
Nov 15 11:57:55 putsproxp01 systemd[1]: Stopped Ceph cluster monitor daemon.
Nov 15 11:57:55 putsproxp01 systemd[1]: ceph-mon@proxp01.service: Start request repeated too quickly.
Nov 15 11:57:55 putsproxp01 systemd[1]: ceph-mon@proxp01.service: Failed with result 'exit-code'.
Nov 15 11:57:55 putsproxp01 systemd[1]: Failed to start Ceph cluster monitor daemon.
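
The status output only shows that the restart limit was hit; the actual error is in the unit's journal and in the monitor's own log under /var/log/ceph/. For example, the last journal entries for the unit can be pulled with:

Code:
journalctl -u ceph-mon@proxp01.service -n 50 --no-pager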

The log tail looks like this:

2019-11-15 11:57:45.560 7f5f099fc440 4 rocksdb: Options.compaction_readahead_size: 0
2019-11-15 11:57:45.560 7f5f099fc440 4 rocksdb: Compression algorithms supported:
2019-11-15 11:57:45.560 7f5f099fc440 4 rocksdb: kZSTDNotFinalCompression supported: 0
2019-11-15 11:57:45.560 7f5f099fc440 4 rocksdb: kZSTD supported: 0
2019-11-15 11:57:45.560 7f5f099fc440 4 rocksdb: kXpressCompression supported: 0
2019-11-15 11:57:45.560 7f5f099fc440 4 rocksdb: kLZ4HCCompression supported: 1
2019-11-15 11:57:45.560 7f5f099fc440 4 rocksdb: kLZ4Compression supported: 1
2019-11-15 11:57:45.560 7f5f099fc440 4 rocksdb: kBZip2Compression supported: 0
2019-11-15 11:57:45.560 7f5f099fc440 4 rocksdb: kZlibCompression supported: 1
2019-11-15 11:57:45.560 7f5f099fc440 4 rocksdb: kSnappyCompression supported: 1
2019-11-15 11:57:45.560 7f5f099fc440 4 rocksdb: Fast CRC32 supported: Supported on x86
2019-11-15 11:57:45.560 7f5f099fc440 4 rocksdb: [db/db_impl.cc:390] Shutdown: canceling all background work
2019-11-15 11:57:45.560 7f5f099fc440 4 rocksdb: [db/db_impl.cc:563] Shutdown complete
2019-11-15 11:57:45.560 7f5f099fc440 -1 rocksdb: IO error: while open a file for lock: /var/lib/ceph/mon/ceph-proxp01/store.db/LOCK: Permission denied
2019-11-15 11:57:45.560 7f5f099fc440 -1 error opening mon data directory at '/var/lib/ceph/mon/ceph-proxp01': (22) Invalid argument

I have 6 nodes; they are all doing the same thing.
 
Did you try to start it directly as root once?
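For reference, the manual equivalent of the unit's ExecStart above (run as root on the affected node, assuming the default cluster name "ceph") would be roughly:

Code:
/usr/bin/ceph-mon -f --cluster ceph --id proxp01 --setuser ceph --setgroup ceph

Using -d instead of -f keeps the daemon in the foreground and sends the log output to stderr, which makes the startup failure easier to see directly in the terminal.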

What does
Code:
ls -al /var/lib/ceph/mon/*
find /var/lib/ceph/ ! -user ceph

output? In particular, the second command should only list the bootstrap-osd ceph keyring, and maybe the crash directory.
Are all files, especially the ones Ceph tries to access in the logs, owned by the "ceph" user?
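
If the find command does list files under /var/lib/ceph/mon/ that are not owned by ceph, a possible fix (assuming the mon store itself is intact and only its ownership was changed) is to chown the mon directory back and restart the unit, e.g.:

Code:
chown -R ceph:ceph /var/lib/ceph/mon/ceph-proxp01
systemctl reset-failed ceph-mon@proxp01.service
systemctl start ceph-mon@proxp01.service

The reset-failed step clears the "Start request repeated too quickly" state so systemd will attempt the start again.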