Ceph stays in HEALTH_WARN

Norman Uittenbogaart

Renowned Member
Feb 28, 2012
Rotterdam, Netherlands
I created a Ceph node, but the health never goes to healthy.
I increased the number of PGs from the default, but I don't know what else I can try.

The cluster status stays like the following:

Code:
cluster 6b91e476-9579-43a1-a589-52e01a49bcc6
health HEALTH_WARN 256 pgs incomplete; 256 pgs stuck inactive; 256 pgs stuck unclean
monmap e1: 1 mons at {0=192.168.1.249:6789/0}, election epoch 2, quorum 0 0
osdmap e45: 7 osds: 7 up, 7 in
pgmap v99: 256 pgs, 1 pools, 0 bytes data, 0 objects
245 MB used, 3257 GB / 3257 GB avail
256 creating+incomplete

It is a brand new configuration.
Journals are on SSD, and the OSDs were created through the web interface.

Why is it not fixing itself?
 
Hi,
do you have the OSDs on three different hosts? The default replica count for pools is 3.

If you have OSDs on only 2 nodes, you can't get a healthy cluster.

Code:
ceph osd tree
ceph osd pool get rbd size
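
If the replica count turns out to be higher than the number of hosts you have, it can be lowered on the existing pool. This is only a sketch assuming the default rbd pool; adjust the pool name and sizes to your setup:

Code:
# assuming the pool is named rbd; replace with your pool name
ceph osd pool set rbd size 2
ceph osd pool set rbd min_size 1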

Udo
 
For now I have 1 node with the OSDs, but it has enough OSDs to replicate.
Just for kicks I purged everything, restarted, and set the pool size to 1, just to see if it would get healthy.

Code:
cluster e7720091-1647-4006-84f5-b627bf057609
health HEALTH_WARN 64 pgs stuck unclean
monmap e1: 1 mons at {0=192.168.1.249:6789/0}, election epoch 2, quorum 0 0
osdmap e16: 3 osds: 3 up, 3 in
pgmap v26: 64 pgs, 1 pools, 0 bytes data, 0 objects
102156 kB used, 1396 GB / 1396 GB avail
39 active
25 active+remapped

It stays stuck on unclean and won't go to healthy.

Code:
ceph osd tree
# id    weight  type name       up/down reweight
-1      1.35    root default
-2      1.35            host nod3
0       0.45                    osd.0   up      1
1       0.45                    osd.1   up      1
2       0.45                    osd.2   up      1

ceph osd pool get rbd size
size: 1
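
To see which failure domain the CRUSH rule spreads replicas across (host vs. osd), the rules can be dumped; this is just a diagnostic sketch, and the exact rule names can differ per release:

Code:
# look at the "chooseleaf" step in the output: type "host" means
# replicas must be placed on different hosts
ceph osd crush rule dump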
 
You would have to modify your CRUSH map to allow it to replicate within a single host. By default, as Udo said, you need to have at least 3 hosts for the replication to work with the default CRUSH map.
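
As a sketch of that change (the filenames here are arbitrary, and it assumes the decompiled map contains a rule with "step chooseleaf firstn 0 type host"), the map can be exported, edited to use osd as the failure domain, and re-imported:

Code:
# export and decompile the current CRUSH map
ceph osd getcrushmap -o crushmap.bin
crushtool -d crushmap.bin -o crushmap.txt
# edit crushmap.txt: change "step chooseleaf firstn 0 type host"
#                    to     "step chooseleaf firstn 0 type osd"
# recompile and inject the modified map
crushtool -c crushmap.txt -o crushmap-new.bin
ceph osd setcrushmap -i crushmap-new.bin

Alternatively, for a cluster that has not been deployed yet, setting "osd crush chooseleaf type = 0" in ceph.conf before creating the cluster makes the default rule replicate across OSDs instead of hosts.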