Problems using "Start at boot" with CEPH

Jonas S.

I already have a solution for my problem, but I want to share it. I would also like some feedback on the solution, or maybe someone knows a simpler one? Maybe it's all my fault, as this is the first time I am using CEPH with Proxmox :).

Server setup (all servers run 4.3-1/e7cdc165):
  • 2 x storage servers which don't run any VMs; these each run a CEPH monitor and several OSDs
  • 2 x servers which run the virtual machines; these only run a CEPH monitor
  • all servers are in one cluster, storage connections are 10 GbE
  • there are two rings for Corosync, each on different NICs and switches
I had the following problems:
  • VMs do get started, but they don't actually run: Proxmox thinks the CEPH storage is available and starts the VMs, so they appear started in the UI
  • Proxmox displays the CEPH storage as available, but trying to show its contents fails/times out
  • in reality, the CEPH storage is not available
Why is this happening?
Looking at "journalctl -b | grep ceph", the OSDs can't be started because during boot they cannot reach the monitor at 192.168.10.200:6789, so the "osd crush create-or-move" step times out:
Code:
Sep 30 15:07:27 pvestorage1 ceph[1515]: === osd.0 ===
Sep 30 15:07:33 pvestorage1 ceph[1515]: 2016-09-30 15:07:33.430172 7fc4c4489700  0 -- 192.168.10.10:0/1157019147 >> 192.168.10.200:6789/0 pipe(0x7fc4b0000c00 sd=6 :0 s=1 pgs=0 cs=0 l=1 c=0x7fc4b0004ef0).fault
Sep 30 15:07:39 pvestorage1 ceph[1515]: 2016-09-30 15:07:39.429507 7fc4c468b700  0 -- 192.168.10.10:0/1157019147 >> 192.168.10.200:6789/0 pipe(0x7fc4b0000c00 sd=6 :0 s=1 pgs=0 cs=0 l=1 c=0x7fc4b0006470).fault
Sep 30 15:07:45 pvestorage1 ceph[1515]: 2016-09-30 15:07:45.429287 7fc4c458a700  0 -- 192.168.10.10:0/1157019147 >> 192.168.10.200:6789/0 pipe(0x7fc4b0000c00 sd=4 :0 s=1 pgs=0 cs=0 l=1 c=0x7fc4b0004ea0).fault
Sep 30 15:07:51 pvestorage1 ceph[1515]: 2016-09-30 15:07:51.429223 7fc4c4489700  0 -- 192.168.10.10:0/1157019147 >> 192.168.10.200:6789/0 pipe(0x7fc4b0000c00 sd=4 :0 s=1 pgs=0 cs=0 l=1 c=0x7fc4b0004ea0).fault
Sep 30 15:07:58 pvestorage1 ceph[1515]: failed: 'timeout 30 /usr/bin/ceph -c /etc/ceph/ceph.conf --name=osd.0 --keyring=/var/lib/ceph/osd/ceph-0/keyring osd crush create-or-move -- 0 0.87 host=pvestorage1 root=default'
Sep 30 15:07:58 pvestorage1 ceph[1515]: ceph-disk: Error: ceph osd start failed: Command '['/usr/sbin/service', 'ceph', '--cluster', 'ceph', 'start', 'osd.0']' returned non-zero exit status 1
Sep 30 15:07:58 pvestorage1 ceph[1515]: ceph-disk: Error: One or more partitions failed to activate

"journalctl -b | grep mount" shows:
Code:
Sep 30 15:06:57 pvestorage1 kernel: XFS (sdc1): Ending clean mount
Sep 30 15:07:27 pvestorage1 kernel: XFS (sdb1): Ending clean mount

I don't know why Proxmox still thinks the OSDs are running. After seeing this I tried to start the OSDs manually with "pveceph start". This works: all OSDs are started and I can access the RBD images.
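For reference, the manual recovery is just this (a rough sketch; "vmimages" stands in for whatever the RBD storage is called in storage.cfg):
Code:
# check whether the monitors have formed a quorum (hangs/times out if not)
timeout 10 ceph -s

# start the CEPH services (OSDs) on this node
pveceph start

# verify that the RBD storage answers again
pvesm list vmimages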
To automate this, so I don't have to start CEPH and the VMs manually, I wrote the following two scripts. The first script starts the OSD daemons and the second starts all VMs which are tagged to start at boot. The first script runs on every node where CEPH OSD daemons need to be started, the second on every node where VMs need to be started.

I call both scripts in "/etc/rc.local".
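For illustration, an "/etc/rc.local" could look roughly like this (the script paths are just placeholders for wherever you keep them, and the & keeps the wait loops from blocking the rest of the boot):
Code:
#!/bin/sh -e

# storage nodes call the OSD script, VM nodes call the VM script;
# run in the background so the wait loop does not block the boot
/root/scripts/start-ceph-osds.sh &
# /root/scripts/start-vms.sh &

exit 0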

First script to start CEPH OSD daemons:
Code:
#!/bin/sh

# This script waits until all CEPH monitors are reachable, then starts the OSDs.
MON_PORT="6789"

MON_NAME1="192.168.10.10"
MON_NAME2="192.168.10.20"
MON_NAME3="192.168.10.100"
MON_NAME4="192.168.10.200"

LOOP=true

while $LOOP
do
  # nc -z only tests whether the monitor port accepts TCP connections
  if nc -z "$MON_NAME1" "$MON_PORT" 2>/dev/null &&
     nc -z "$MON_NAME2" "$MON_PORT" 2>/dev/null &&
     nc -z "$MON_NAME3" "$MON_PORT" 2>/dev/null &&
     nc -z "$MON_NAME4" "$MON_PORT" 2>/dev/null
  then
    # all monitors answer -> start the local CEPH services (OSDs)
    pveceph start
    LOOP=false
  else
    sleep 10
  fi
done
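As a possible simplification (not what I use above, just a sketch, assuming the node has a readable /etc/ceph/ceph.conf and admin keyring): instead of probing every monitor port with nc, one could ask the cluster itself whether a quorum exists:
Code:
#!/bin/sh

# Wait until the monitors have formed a quorum, then start the local OSDs.
# "ceph quorum_status" hangs while no quorum exists, hence the timeout.
until timeout 10 ceph quorum_status > /dev/null 2>&1
do
  sleep 10
done

pveceph start

That way only a majority of the monitors has to be up, which is what CEPH actually needs, instead of all four.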

Second script to start the VMs once the CEPH storage is available:
Code:
#!/bin/sh

# Tests if the CEPH storage is up and running, then starts the onboot VMs.

#STORAGE='vmimages fastvmimages'
STORAGE='vmimages'
GRACETIME=20
LOOP=true

while [ $LOOP = true ]
do
  LOOP=false
  for item in $STORAGE
  do
    # listing the storage content only succeeds once CEPH answers
    if ! timeout 5 pvesh get "nodes/localhost/storage/$item/content" > /dev/null 2>&1
    then
      LOOP=true
      break
    fi
  done

  if [ $LOOP = false ]
  then
    # give the storage a little extra time, then start the onboot VMs
    sleep $GRACETIME
    pvesh create nodes/localhost/startall > /dev/null 2>&1
  else
    sleep 10
  fi
done
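The same check and start can also be done by hand to see what the API reports (storage name as configured in /etc/pve/storage.cfg):
Code:
# listing the storage content only works once CEPH is really available
timeout 5 pvesh get nodes/localhost/storage/vmimages/content

# afterwards start everything marked "Start at boot" on this node
pvesh create nodes/localhost/startall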

With those scripts everything works fine after I reboot the cluster or a single node.

Best regards
Jonas Stunkat