If you want performance like the 'filestore design with journal', you need to set up bcache, as mentioned elsewhere on the forum. Moving the DB + WAL to an SSD didn't improve write speed by a noticeable factor.
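For reference, a rough bcache sketch with bcache-tools (device names are examples only, not taken from this thread):
make-bcache -C /dev/nvme0n1p1 -B /dev/sdb    # SSD partition as cache, HDD as backing device
echo writeback > /sys/block/bcache0/bcache/cache_mode
# then create the OSD on /dev/bcache0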
1. When you move an osd from one node to another, you don't need to destroy and re-create it. It will be automagically discovered and added.
2. To remove an osd from the cluster, use ceph osd purge {id} --yes-i-really-mean-it (see the example after this list).
3. In ceph commands you should use the numeric osd id, e.g. ceph auth del 7...
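For example, a full removal could look like this (osd 7 is just an illustrative id; run the stop command on the node that hosts the osd):
ceph osd out 7
systemctl stop ceph-osd@7
ceph osd purge 7 --yes-i-really-mean-it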
My 2 cents:
1. pg_num should be a power of 2 (in this case, 1024).
2. You did too many jobs at once. You should:
a) add first osd;
b) wait for HEALTH_OK;
c) add second osd;
d) wait for HEALTH_OK;
...
z) increase pg_num (see the sketch after this list).
3. The 'too many PGs per OSD' warning warns you about a real problem, you...
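A minimal sketch of that sequence, assuming a Proxmox VE 5.x node with a spare disk /dev/sdb and a pool named ceph-vm (both names are only examples; you can also add the osd via the GUI):
pveceph createosd /dev/sdb    # add one osd
ceph -s                       # repeat until HEALTH_OK, then add the next osd
# ... after all osds are in and the cluster is healthy:
ceph osd pool set ceph-vm pg_num 1024
ceph osd pool set ceph-vm pgp_num 1024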
So now you have a problem with osd 0 and osd 11 (no space left on device?).
With
mon osd full ratio = .98
mon osd nearfull ratio = .95
you only disable the warning; it will not free any space on your osds. Maybe reweighting these osds will help (see the example below).
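If you want to try the reweight route, something like this (0.9 is only an example weight):
ceph osd reweight 0 0.9
ceph osd reweight 11 0.9
# or let ceph pick the weights based on utilization:
ceph osd reweight-by-utilization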
Check what is causing the peering problem:
http://docs.ceph.com/docs/master/rados/troubleshooting/troubleshooting-pg/#placement-group-down-peering-failure
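Start with something like this (the pg id is a placeholder, take a real one from the health output):
ceph health detail
ceph pg <pgid> query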
You have PGs stuck in activating state because of
Follow this
https://forum.proxmox.com/threads/ceph-problem-after-upgrade-to-5-1-slow-requests-stuck-request.38586/
and then wait for recovery to complete.
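You can watch the recovery with, for example:
ceph pg dump_stuck inactive
ceph -s    # or ceph -w to follow it live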