We have already configured a Proxmox HA cluster with Ceph storage. The cluster was shut down without any problems, and now (after the summer) none of our Proxmox servers can start the Ceph storage. It seems to be a problem with the OSDs, but I cannot find a solution.
Thanks.
SRV Ceph start messages:
=== osd.0 ===
2016-09-07 14:11:20.721132 7f2a67869700 0 -- :/1138065289 >> 192.168.1.239:6789/0 pipe(0x7f2a6c061550 sd=3 :0 s=1 pgs=0 cs=0 l=1 c=0x7f2a6c05a3f0).fault
2016-09-07 14:11:26.721138 7f2a67667700 0 -- 192.168.1.240:0/1138065289 >> 192.168.1.239:6789/0 pipe(0x7f2a5c006e20 sd=3 :0 s=1 pgs=0 cs=0 l=1 c=0x7f2a5c00b0c0).fault
2016-09-07 14:11:29.653123 7f2a67869700 0 -- 192.168.1.240:0/1138065289 >> 192.168.1.238:6789/0 pipe(0x7f2a5c000c00 sd=4 :0 s=1 pgs=0 cs=0 l=1 c=0x7f2a5c00cd90).fault
2016-09-07 14:11:32.721115 7f2a67667700 0 -- 192.168.1.240:0/1138065289 >> 192.168.1.239:6789/0 pipe(0x7f2a5c006e20 sd=4 :0 s=1 pgs=0 cs=0 l=1 c=0x7f2a5c00e6f0).fault
2016-09-07 14:11:38.653117 7f2a67768700 0 -- 192.168.1.240:0/1138065289 >> 192.168.1.238:6789/0 pipe(0x7f2a5c006e20 sd=3 :0 s=1 pgs=0 cs=0 l=1 c=0x7f2a5c00b570).fault
2016-09-07 14:11:44.653124 7f2a67869700 0 -- 192.168.1.240:0/1138065289 >> 192.168.1.238:6789/0 pipe(0x7f2a5c006e20 sd=3 :0 s=1 pgs=0 cs=0 l=1 c=0x7f2a5c00c030).fault
2016-09-07 14:11:47.721053 7f2a67768700 0 -- 192.168.1.240:0/1138065289 >> 192.168.1.239:6789/0 pipe(0x7f2a5c000c00 sd=4 :0 s=1 pgs=0 cs=0 l=1 c=0x7f2a5c010a50).fault
failed: 'timeout 30 /usr/bin/ceph -c /etc/pve/ceph.conf --name=osd.0 --keyring=/var/lib/ceph/osd/ceph-0/keyring osd crush create-or-move -- 0 1.82 host=46024953-HV1 root=default'
=== mon.0 ===
Starting Ceph mon.0 on 46024953-HV1...already running
=== osd.0 ===
2016-09-07 14:11:53.721126 7f06d429b700 0 -- 192.168.1.240:0/1105037462 >> 192.168.1.239:6789/0 pipe(0x7f06c0000c00 sd=8 :0 s=1 pgs=0 cs=0 l=1 c=0x7f06c0004ef0).fault
2016-09-07 14:11:59.653122 7f06d449d700 0 -- 192.168.1.240:0/1105037462 >> 192.168.1.238:6789/0 pipe(0x7f06c0000c00 sd=8 :0 s=1 pgs=0 cs=0 l=1 c=0x7f06c0006470).fault
2016-09-07 14:12:02.721151 7f06d429b700 0 -- 192.168.1.240:0/1105037462 >> 192.168.1.239:6789/0 pipe(0x7f06c00080e0 sd=8 :0 s=1 pgs=0 cs=0 l=1 c=0x7f06c00054e0).fault
2016-09-07 14:12:08.653111 7f06d439c700 0 -- 192.168.1.240:0/1105037462 >> 192.168.1.238:6789/0 pipe(0x7f06c00080e0 sd=8 :0 s=1 pgs=0 cs=0 l=1 c=0x7f06c0005750).fault
2016-09-07 14:12:14.653121 7f06d449d700 0 -- 192.168.1.240:0/1105037462 >> 192.168.1.238:6789/0 pipe(0x7f06c00080e0 sd=4 :0 s=1 pgs=0 cs=0 l=1 c=0x7f06c0011350).fault
2016-09-07 14:12:17.721113 7f06d439c700 0 -- 192.168.1.240:0/1105037462 >> 192.168.1.239:6789/0 pipe(0x7f06c0000c00 sd=8 :0 s=1 pgs=0 cs=0 l=1 c=0x7f06c0004ea0).fault
failed: 'timeout 30 /usr/bin/ceph -c /etc/ceph/ceph.conf --name=osd.0 --keyring=/var/lib/ceph/osd/ceph-0/keyring osd crush create-or-move -- 0 1.82 host=46024953-HV1 root=default'
ceph-disk: Error: ceph osd start failed: Command '['/usr/sbin/service', 'ceph', '--cluster', 'ceph', 'start', 'osd.0']' returned non-zero exit status 1
ceph-disk: Error: One or more partitions failed to activate
TASK ERROR: command 'setsid service ceph -c /etc/pve/ceph.conf start ''' failed: exit code 1