[SOLVED] Unable to create ceph-mgr or OSD on new setup

semira uthsala

Active Member
Nov 19, 2019
Singapore
Hi all,

My PVE setup has 3 storage nodes on which I installed Ceph Nautilus 14.2.9. I went through the GUI installation and created 3 monitors after the initial Ceph installation.

Right after adding the 3 monitors, it warned me "clock skew detected on mon.01 and mon.02". I configured NTP, and the time is correct and identical on all 3 nodes.

After the monitor setup I tried to add a mgr on the other two nodes (one mgr had already been created during the configuration step), and I get a timeout error every time I try to create a manager.

Code:
root@storage-node-02:~# pveceph mgr create
creating manager directory '/var/lib/ceph/mgr/ceph-storage-node-02'
creating keys for 'mgr.storage-node-02'
got timeout

I use two separate networks for the frontend and cluster traffic. Both networks are reachable from all three nodes.

After many attempts I purged the cluster using the commands below:

Code:
#!/bin/bash
rm -rf /etc/systemd/system/ceph*
killall -9 ceph-mon ceph-mgr ceph-mds
rm -rf /var/lib/ceph/mon/  /var/lib/ceph/mgr/  /var/lib/ceph/mds/
pveceph purge
rm -rf /etc/pve/ceph.conf
rm -rf /etc/ceph/ceph.conf
apt -y purge ceph-mon ceph-osd ceph-mgr ceph-mds
apt -y autoremove
rm /etc/init.d/ceph

Then I rebooted the nodes, did a clean Ceph install, and tried to create the manager again. I still get the same error: a timeout when trying to create a ceph-mgr (from both the GUI and the CLI).

Code:
root@storage-node-02:~# ceph -s
  cluster:
    id:     ed50689b-ba7b-4a2d-af27-f8007d22d8ff
    health: HEALTH_WARN
            OSD count 0 < osd_pool_default_size 3
            clock skew detected on mon.storage-node-02

  services:
    mon: 2 daemons, quorum storage-node-01,storage-node-02 (age 6m)
    mgr: no daemons active (since 6m)
    osd: 0 osds: 0 up, 0 in

  data:
    pools:   0 pools, 0 pgs
    objects: 0 objects, 0 B
    usage:   0 B used, 0 B / 0 B avail
    pgs:
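
For anyone reading along: Ceph raises this warning when monitor clocks differ by more than `mon_clock_drift_allowed` (0.05 s by default). Below is a minimal sketch for pulling the affected monitor names out of health output like the above. `skewed_mons` is a hypothetical helper name, not a Ceph command, and it assumes the warning text keeps the "clock skew detected on mon.NAME" form shown here.

```shell
#!/bin/bash
# skewed_mons (hypothetical helper): reads `ceph -s` or `ceph health detail`
# text on stdin and prints one monitor name per line for every monitor
# mentioned in a "clock skew detected on ..." warning.
skewed_mons() {
    grep -o 'clock skew detected on .*' \
        | grep -o 'mon\.[A-Za-z0-9_.-]*' \
        | sed 's/^mon\.//'
}
```

Usage would be `ceph -s | skewed_mons` on any node that can reach the monitors. `ceph time-sync-status` should give the per-monitor skew values directly if more detail is needed.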

After this, I tried to create an OSD and got a timeout as well (timeout 500). Is there anything wrong with the Proxmox or Ceph version I am using?

Code:
root@storage-node-02:~# pveversion -v
proxmox-ve: 6.2-1 (running kernel: 5.4.34-1-pve)
pve-manager: 6.2-4 (running version: 6.2-4/9824574a)
pve-kernel-5.4: 6.2-1
pve-kernel-helper: 6.2-1
pve-kernel-5.4.34-1-pve: 5.4.34-2
ceph: 14.2.9-pve1
ceph-fuse: 14.2.9-pve1
corosync: 3.0.3-pve1
criu: 3.11-3
glusterfs-client: 5.5-3
ifupdown: 0.8.35+pve1
ksm-control-daemon: 1.3-1
libjs-extjs: 6.0.1-10
libknet1: 1.15-pve1
libproxmox-acme-perl: 1.0.3
libpve-access-control: 6.1-1
libpve-apiclient-perl: 3.0-3
libpve-common-perl: 6.1-2
libpve-guest-common-perl: 3.0-10
libpve-http-server-perl: 3.0-5
libpve-storage-perl: 6.1-7
libqb0: 1.0.5-1
libspice-server1: 0.14.2-4~pve6+1
lvm2: 2.03.02-pve4
lxc-pve: 4.0.2-1
lxcfs: 4.0.3-pve2
novnc-pve: 1.1.0-1
proxmox-mini-journalreader: 1.1-1
proxmox-widget-toolkit: 2.2-1
pve-cluster: 6.1-8
pve-container: 3.1-5
pve-docs: 6.2-4
pve-edk2-firmware: 2.20200229-1
pve-firewall: 4.1-2
pve-firmware: 3.1-1
pve-ha-manager: 3.0-9
pve-i18n: 2.1-2
pve-qemu-kvm: 5.0.0-2
pve-xtermjs: 4.3.0-1
qemu-server: 6.2-2
smartmontools: 7.1-pve2
spiceterm: 3.1-1
vncterm: 1.6-1
zfsutils-linux: 0.8.3-pve1
 
Hi All,

My NTP was not working even though I had configured it properly, because of a firewall rule. I fixed that and did a clean reinstall of Ceph. Now everything works properly: I can create ceph-mgr and ceph-mon daemons without any timeouts, and the "clock skew detected" warning is gone as well.
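
In case it helps someone else, here is a rough sketch of how to tell "NTP is configured" apart from "NTP is actually synchronizing", which is what caught me out. It assumes systemd's `timedatectl` is available (as far as I know the stock time client on Proxmox VE 6 is systemd-timesyncd); `ntp_state` is a made-up helper name. With chrony or ntpd, `chronyc tracking` or `ntpq -p` would be the equivalent checks.

```shell
#!/bin/bash
# ntp_state (hypothetical helper): reads `timedatectl show` output on stdin
# and prints "synced" only when the NTPSynchronized property is "yes".
# A node can report NTP=yes (service enabled) while NTPSynchronized=no,
# for example when a firewall rule blocks the NTP traffic.
ntp_state() {
    if grep -qx 'NTPSynchronized=yes'; then
        echo synced
    else
        echo unsynced
    fi
}
```

Run `timedatectl show | ntp_state` on each node. If any node prints `unsynced`, the clock is not being corrected there, and a firewall rule blocking NTP is one thing to rule out.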
 
