Ceph: New Monitors showing no Quorum

Discussion in 'Proxmox VE: Installation and configuration' started by MediaLab, Jun 19, 2017.

  1. MediaLab

    MediaLab New Member

    Joined:
    Jun 19, 2017
    Messages:
    6
    Likes Received:
    0
    I am having an issue with Proxmox using Ceph. Ceph itself is working great, but when I add a new Proxmox node with Ceph, specifically adding a monitor, it shows up as a monitor but displays "no" under the quorum column.

    See attached screenshot.

    As you can see, mon.3 and mon.4 show "no" for quorum. They are all on the same 10G network.
     

    Attached Files:

  2. tom

    tom Proxmox Staff Member
    Staff Member

    Joined:
    Aug 29, 2006
    Messages:
    13,538
    Likes Received:
    404
    But why are you using more than 3 monitors?

    3 is the recommended number.
     
  3. MediaLab

    MediaLab New Member

    Joined:
    Jun 19, 2017
    Messages:
    6
    Likes Received:
    0
    It's not an issue of using more than 3 monitors. I was testing whether I could add additional monitors if I had to remove others. Just bulletproof-testing my Proxmox setup.
     
  4. MediaLab

    MediaLab New Member

    Joined:
    Jun 19, 2017
    Messages:
    6
    Likes Received:
    0
    Actually, this is probably a good question: will Proxmox Ceph allow more than 3 monitors in the quorum at once?
     
  5. fabian

    fabian Proxmox Staff Member
    Staff Member

    Joined:
    Jan 7, 2016
    Messages:
    3,270
    Likes Received:
    505
    Of course. But the recommended number is 3 for small clusters and 5 for medium/bigger ones (although giving the monitors their own standalone nodes is probably more helpful than adding another two ;))
     
  6. MediaLab

    MediaLab New Member

    Joined:
    Jun 19, 2017
    Messages:
    6
    Likes Received:
    0
    Where should I start in diagnosing this issue?
     
  7. fabian

    fabian Proxmox Staff Member
    Staff Member

    Joined:
    Jan 7, 2016
    Messages:
    3,270
    Likes Received:
    505
    Check the logs on the hosts where those monitors are running ('journalctl -u "ceph-mon*"'), and watch "ceph status" and "ceph -w" while restarting one of the non-quorate monitor instances.
     
  8. MediaLab

    MediaLab New Member

    Joined:
    Jun 19, 2017
    Messages:
    6
    Likes Received:
    0
    So I noticed that the two monitors that are not in quorum do not have /0 after their ports in the WebGUI on the Monitor screen.

    service ceph status:
    ceph.service - LSB: Start Ceph distributed file system daemons at boot time
    Loaded: loaded (/etc/init.d/ceph)
    Active: active (exited) since Fri 2017-06-09 16:01:31 EDT; 1 weeks 6 days ago

    Jun 09 16:01:30 int-p-mid1 ceph[1811]: === mon.4 ===
    Jun 09 16:01:30 int-p-mid1 ceph[1811]: Starting Ceph mon.4 on int-p-mid1...
    Jun 09 16:01:30 int-p-mid1 ceph[1811]: Running as unit ceph-mon.4.1497038490.341517916.service.
    Jun 09 16:01:30 int-p-mid1 ceph[1811]: Starting ceph-create-keys on int-p-mid1...
    Jun 09 16:01:31 int-p-mid1 systemd[1]: Started LSB: Start Ceph distributed file system daemons at boot time.


    Ceph Status:
    2017-06-22 16:15:35.517812 7f05e8329700 0 -- :/1543168526 >> 192.168.110.200:6789/0 pipe(0x7f05e4060030 sd=3 :0 s=1 pgs=0 cs=0 l=1 c=0x7f05e405dd20).fault
    cluster 1c47df50-7ed0-47d9-af71-d92152a95edf
    health HEALTH_OK
    monmap e3: 3 mons at {0=192.168.110.202:6789/0,1=192.168.110.203:6789/0,2=192.168.110.201:6789/0}
    election epoch 116, quorum 0,1,2 2,0,1
    osdmap e1221: 15 osds: 15 up, 15 in
    flags sortbitwise,require_jewel_osds
    pgmap v1472042: 192 pgs, 2 pools, 450 GB data, 115 kobjects
    1345 GB used, 2411 GB / 3756 GB avail
    192 active+clean
    client io 32521 B/s rd, 1757 kB/s wr, 8 op/s rd, 42 op/s wr
    root@int-p-mid1:~# ceph status
    2017-06-22 16:15:49.365284 7f94c0301700 0 -- :/426723825 >> 192.168.110.204:6789/0 pipe(0x7f94bc060030 sd=3 :0 s=1 pgs=0 cs=0 l=1 c=0x7f94bc05dd20).fault
    cluster 1c47df50-7ed0-47d9-af71-d92152a95edf
    health HEALTH_OK
    monmap e3: 3 mons at {0=192.168.110.202:6789/0,1=192.168.110.203:6789/0,2=192.168.110.201:6789/0}
    election epoch 116, quorum 0,1,2 2,0,1
    osdmap e1221: 15 osds: 15 up, 15 in
    flags sortbitwise,require_jewel_osds
    pgmap v1472052: 192 pgs, 2 pools, 450 GB data, 115 kobjects
    1345 GB used, 2411 GB / 3756 GB avail
    192 active+clean
    client io 84242 B/s rd, 385 kB/s wr, 19 op/s rd, 14 op/s wr

    Nothing out of the ordinary when restarting. What next? I have been banging my head on the table over this, lol.
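One thing worth noticing in the paste above: the monmap line says "monmap e3: 3 mons", i.e. only three monitors ever made it into the monmap, and the .fault lines are the client failing to reach 192.168.110.200 and .204. A throwaway parsing sketch (plain Python, not a Ceph tool; the sample lines are copied from the paste) that pulls the monmap members and quorum ranks out of that output:

```python
import re

# The relevant "ceph status" lines pasted above (Jewel-era format, assumed verbatim).
status = (
    "monmap e3: 3 mons at {0=192.168.110.202:6789/0,"
    "1=192.168.110.203:6789/0,2=192.168.110.201:6789/0}\n"
    "election epoch 116, quorum 0,1,2 2,0,1\n"
)

# Monitor IDs and addresses the monmap actually contains.
monmap = dict(re.findall(r"(\w+)=([\d.]+:\d+)", status))

# Ranks currently in quorum, from the "quorum 0,1,2" field.
quorum = re.search(r"quorum ([\d,]+)", status).group(1).split(",")

print("monmap members:", sorted(monmap))  # only mons 0, 1 and 2
print("quorum ranks:  ", quorum)          # all three of them are quorate
```

So all three monitors that exist in the monmap are quorate; the two "no quorum" monitors simply are not in the monmap at all.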
     
  9. fabian

    fabian Proxmox Staff Member
    Staff Member

    Joined:
    Jan 7, 2016
    Messages:
    3,270
    Likes Received:
    505
    What does your ceph config look like?
     
  10. MediaLab

    MediaLab New Member

    Joined:
    Jun 19, 2017
    Messages:
    6
    Likes Received:
    0
    [global]
    auth client required = cephx
    auth cluster required = cephx
    auth service required = cephx
    cluster network = 192.168.110.0/24
    filestore xattr use omap = true
    fsid = 1c47df50-7ed0-47d9-af71-d92152a95edf
    keyring = /etc/pve/priv/$cluster.$name.keyring
    osd journal size = 5120
    osd pool default min size = 1
    public network = 192.168.110.0/24

    [osd]
    keyring = /var/lib/ceph/osd/ceph-$id/keyring

    [mon.2]
    host = int-p-edge1
    mon addr = 192.168.110.201:6789

    [mon.0]
    host = int-p-edge2
    mon addr = 192.168.110.202:6789

    [mon.4]
    host = int-p-mid1
    mon addr = 192.168.110.204:6789

    [mon.1]
    host = int-p-edge3
    mon addr = 192.168.110.203:6789

    [mon.3]
    host = int-p-offsite
    mon addr = 192.168.110.200:6789
     
  11. Tekuno-Kage

    Tekuno-Kage New Member

    Joined:
    Jun 1, 2016
    Messages:
    11
    Likes Received:
    4
    I know this is an old thread, but no one reported whether it was fixed.
    Therefore, I suggest checking the MTU between the nodes. I faced that kind of problem.
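To expand on the MTU suggestion: a common check is ping with the don't-fragment flag, sized so the packet exactly fills the MTU. A sketch of the payload arithmetic (assuming IPv4: 20-byte IP header plus 8-byte ICMP header; the peer address is whatever your other node is):

```python
# Size the ping payload so the full packet equals the link MTU.
# IPv4 header = 20 bytes, ICMP header = 8 bytes.
def ping_payload(mtu: int) -> int:
    return mtu - 20 - 8

print(ping_payload(9000))  # payload for "ping -M do -s 8972 <peer>" on a 9000-MTU link
print(ping_payload(1500))  # 1472 on a standard Ethernet link
```

If `ping -M do -s 8972 <peer>` fails on a link you believe is set to jumbo frames, something in the path is dropping or fragmenting the large packets, and monitor traffic can suffer the same fate.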
     