Proxmox Cluster 3 nodes, Monitors refuse to start

Danny-10-10

I posted this in the wrong section before, so I am posting it here, hoping this is the right place.

Hi all, I am facing a strange issue. After using a Proxmox PC for my self-hosted app, I decided to play around and create a cluster to dive deeper into HA topics. I downloaded the latest ISO and built a cluster from scratch. My cluster works: I can see every node, and my Ceph storage says everything is OK. Managers work on all 3 nodes and the metadata servers are OK on all 3 nodes, but the monitor started only on the first node. When I try to start it on the other nodes, nothing happens.
This is the syslog of the second node:

Oct 28 00:13:47 pve2 ceph-mon[1041]: 2025-10-28T00:13:47.531+0100 7265f2d4c6c0 -1 received signal: Hangup from killall -q -1 ceph-mon ceph-mgr ceph-mds ceph-osd ceph-fuse radosgw rbd-mirror cephfs-mirror (PID: 2949170) UID: 0
Oct 28 00:13:47 pve2 ceph-mon[1041]: 2025-10-28T00:13:47.531+0100 7265f2d4c6c0 -1 mon.pve2@0(leader) e1 *** Got Signal Hangup ***
Oct 28 00:13:47 pve2 ceph-mon[1041]: 2025-10-28T00:13:47.554+0100 7265f2d4c6c0 -1 received signal: Hangup from (PID: 2949171) UID: 0
Oct 28 00:13:47 pve2 ceph-mon[1041]: 2025-10-28T00:13:47.554+0100 7265f2d4c6c0 -1 mon.pve2@0(leader) e1 *** Got Signal Hangup ***

This is from the third node:

Oct 28 00:48:10 pve3 ceph-mon[1030]: 2025-10-28T00:48:10.850+0100 7f59362b76c0 -1 received signal: Hangup from killall -q -1 ceph-mon ceph-mgr ceph-mds ceph-osd ceph-fuse radosgw rbd-mirror cephfs-mirror (PID: 740342) UID: 0
Oct 28 00:48:10 pve3 ceph-mon[1030]: 2025-10-28T00:48:10.852+0100 7f59362b76c0 -1 mon.pve3@0(leader) e1 *** Got Signal Hangup ***
Oct 28 00:48:10 pve3 ceph-mon[1030]: 2025-10-28T00:48:10.871+0100 7f59362b76c0 -1 received signal: Hangup from (PID: 740343) UID: 0
Oct 28 00:48:10 pve3 ceph-mon[1030]: 2025-10-28T00:48:10.871+0100 7f59362b76c0 -1 mon.pve3@0(leader) e1 *** Got Signal Hangup ***
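
A note on reading these lines: the `killall -q -1 ...` sender looks like the SIGHUP that Ceph's logrotate hook sends after rotating logs, so the Hangup entries themselves are probably routine rather than a failure. The more informative part is `mon.pve2@0(leader) e1` and `mon.pve3@0(leader) e1`: each daemon reports itself as rank 0 and leader at monmap epoch 1. A minimal shell sketch for pulling those fields out of syslog text follows; `mon_state` is a hypothetical helper name, not a Ceph command.

```shell
# Hypothetical helper: read syslog lines on stdin and print
# "name rank state epoch" for every ceph-mon status line found.
# Lines without a "mon.<name>@<rank>(<state>) e<epoch>" token are skipped.
mon_state() {
    sed -n 's/.* mon\.\([^@ ]*\)@\(-\{0,1\}[0-9][0-9]*\)(\([a-z]*\)) e\([0-9][0-9]*\) .*/\1 \2 \3 \4/p'
}

# Example with a file holding the lines quoted above (file name is an assumption):
#   mon_state < syslog-pve2.txt
```

Fed the pve2 lines quoted above, this prints `pve2 0 leader 1` once per status line, which makes it easy to compare what each node's monitor believes about its own role.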

I am kind of stuck.
 

Attachments

  • Screenshot 2025-10-21 103158.png
  • Screenshot 2025-10-28 112840.png
  • Screenshot 2025-10-28 112900.png

What's the output of
  • ceph -s
  • cat /etc/pve/ceph.conf
Please paste the output within [code][/code] tags or use the formatting buttons of the editor </>.
 
Thank you for your reply


ceph -s output

Code:
  cluster:
    id:     b1e9e7bc-2ec5-4838-9702-7a66f1749bc3
    health: HEALTH_WARN
            2 OSD(s) experiencing slow operations in BlueStore
 
  services:
    mon: 1 daemons, quorum pve (age 13h)
    mgr: pve(active, since 13h), standbys: pve2, pve3
    mds: 1/1 daemons up, 2 standby
    osd: 3 osds: 3 up (since 13h), 3 in (since 7w)
 
  data:
    volumes: 1/1 healthy
    pools:   4 pools, 97 pgs
    objects: 23.00k objects, 88 GiB
    usage:   263 GiB used, 1.1 TiB / 1.4 TiB avail
    pgs:     97 active+clean
 
  io:
    client:   49 KiB/s wr, 0 op/s rd, 9 op/s wr
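
The `mon:` line in this output is the one that matters for the problem: it reports a single daemon with only `pve` in quorum. As a rough sketch, the quorum members can be extracted from saved `ceph -s` text like this; `quorum_members` is a hypothetical helper name, and the pattern assumes the plain-text layout shown above.

```shell
# Hypothetical helper: read `ceph -s` text on stdin and print the monitor
# names listed after "quorum" on the "mon:" line, one name per line.
quorum_members() {
    sed -n 's/^ *mon: .*quorum \([^ ]*\).*/\1/p' | tr ',' '\n'
}

# Example:
#   ceph -s | quorum_members
```

With the output above it prints only `pve`; a healthy three-monitor setup would be expected to list all three node names.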

cat /etc/pve/ceph.conf output
Code:
[global]
        auth_client_required = cephx
        auth_cluster_required = cephx
        auth_service_required = cephx
        cluster_network = 192.168.1.210/24
        fsid = b1e9e7bc-2ec5-4838-9702-7a66f1749bc3
        mon_allow_pool_delete = true
        mon_host = 192.168.1.210
        ms_bind_ipv4 = true
        ms_bind_ipv6 = false
        osd_pool_default_min_size = 2
        osd_pool_default_size = 3
        public_network = 192.168.1.210/24

[client]
        keyring = /etc/pve/priv/$cluster.$name.keyring

[client.crash]
        keyring = /etc/pve/ceph/$cluster.$name.keyring

[mds]
        keyring = /var/lib/ceph/mds/ceph-$id/keyring

[mds.pve]
        host = pve
        mds_standby_for_name = pve

[mds.pve2]
        host = pve2
        mds_standby_for_name = pve

[mds.pve3]
        host = pve3
        mds_standby_for_name = pve

[mon.pve]
        public_addr = 192.168.1.210



PVE has the monitor working (192.168.1.210)
PVE2 (192.168.1.209)
PVE3 (192.168.1.208)
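
For comparison with the three addresses above, the posted ceph.conf lists only 192.168.1.210 in `mon_host` and has only a `[mon.pve]` section. A small sketch for listing the `mon_host` entries of a config file is below; `mon_hosts` is a hypothetical helper, and it assumes the `mon_host = ...` spelling with underscores used in the file above.

```shell
# Hypothetical helper: print each address from the mon_host line of a
# ceph.conf-style file, one per line. Entries may be separated by spaces
# or commas; both are turned into line breaks.
mon_hosts() {
    sed -n 's/^[[:space:]]*mon_host[[:space:]]*=[[:space:]]*//p' "$1" | tr -s ', ' '\n\n'
}

# Example:
#   mon_hosts /etc/pve/ceph.conf
```

Run against the file as posted, it prints a single line, 192.168.1.210, so the addresses of pve2 and pve3 are not known to the configuration as monitors.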
 
I don't know if this poses a problem, but the network is not 100% correct: instead of 192.168.1.210/24, it should be 192.168.1.0/24.
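
To illustrate the point about the network value: 192.168.1.210/24 is a host address with a prefix length, while the network it belongs to is obtained by clearing the host bits. A minimal POSIX shell sketch of that calculation follows; `cidr_network` is a hypothetical helper written for this example and handles IPv4 only.

```shell
# Hypothetical helper: print the network address (in CIDR form) that an
# IPv4 "address/prefix" value belongs to. The body runs in a subshell so
# the IFS change and the variables do not leak into the caller.
cidr_network() (
    ip=${1%/*}        # address part, e.g. 192.168.1.210
    prefix=${1#*/}    # prefix length, e.g. 24
    IFS=.
    set -- $ip        # split the address into its four octets
    addr=$(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
    mask=$(( (0xFFFFFFFF << (32 - prefix)) & 0xFFFFFFFF ))
    net=$(( addr & mask ))
    echo "$(( (net >> 24) & 255 )).$(( (net >> 16) & 255 )).$(( (net >> 8) & 255 )).$(( net & 255 ))/$prefix"
)

cidr_network 192.168.1.210/24   # prints 192.168.1.0/24
```

All three node addresses (.210, .209, .208) map to the same 192.168.1.0/24 network, which is the value suggested for `cluster_network` and `public_network`.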
 
I modified the file according to your suggestion, but when I try to start the monitor, the situation described in my first post doesn't change.