As long as you have inactive (activating) pgs, your cluster will be unusable.
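For reference, a quick way to list exactly which pgs are behind the warning (a sketch, not pasted from the cluster):
# show which pgs are reported as inactive
ceph health detail | grep -i inactive
# or dump only the stuck-inactive ones, with their up/acting OSD sets
ceph pg dump_stuck inactive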
My cluster is now fixed and we are testing IO operations on it. I will post the solution, but I need to translate my notes first ;)
I'll be back with a reply.
Increasing osd_max_backfills will speed up cluster recovery. Mine has recovered as much as it can; once I get rid of those activating+remapped pgs the cluster will be fine.
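A sketch of how the backfill throttle can be raised at runtime on Luminous (the values are only examples, not necessarily what I used):
# allow more concurrent backfills per OSD (the default is 1)
ceph tell osd.* injectargs '--osd-max-backfills 4'
# optionally allow more parallel recovery ops per OSD as well
ceph tell osd.* injectargs '--osd-recovery-max-active 4'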
I found two OSDs which appear as the backfill target in every inactive pg. For some reason they don't want to start this process, so...
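Roughly the kind of check that shows this (a sketch; the pg id below is a placeholder, not one of mine):
# list the stuck-inactive pgs together with their up/acting OSD sets
ceph pg dump_stuck inactive
# query a single pg and look at its recovery/backfill state
ceph pg 1.2f3 query
# the backfill_targets / recovery_state sections in the output show which OSDs the pg is waiting on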
We've tried, but it doesn't help. The number of inactive pgs has decreased, but some still remain.
...
Reduced data availability: 16 pgs inactive
..
4001 active+clean
42 active+undersized+degraded+remapped+backfill_wait
31...
The cluster reaches 3994 active+clean and then gets stuck.
The first slow requests appeared when I upgraded PVE on the last node. After the reboot I saw osd.16 dead and all VMs frozen.
I upgraded from Hammer to Jewel about a month ago, and from Jewel to Luminous (on PVE 4.4) on Wednesday, and it was working fine for a few...
The OSD was down and out, so I destroyed it from Proxmox PVE.
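For completeness, a sketch of what removing a dead OSD looks like from the CLI, using osd.16 from my case as the example id (Proxmox does the equivalent from the GUI, and on Luminous "ceph osd purge" combines the last three steps):
# mark it out so its data re-replicates elsewhere
ceph osd out osd.16
# stop the daemon on its node, if it is still defined there
systemctl stop ceph-osd@16
# drop it from the CRUSH map, delete its auth key and free the id
ceph osd crush remove osd.16
ceph auth del osd.16
ceph osd rm osd.16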
I've bought Red Hat support, and here is their info:
The Ceph CRUSH map doesn't like holes, so after an OSD is deleted, these deviceN entries get added in the devices section of the CRUSH map.
When any new OSD is added back to the cluster, it will take the...
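Those placeholder entries are visible if you decompile the CRUSH map (a sketch; the file names are arbitrary and the device ids are only an illustration):
# export and decompile the current CRUSH map
ceph osd getcrushmap -o crushmap.bin
crushtool -d crushmap.bin -o crushmap.txt
# the devices section then shows the hole left by a removed OSD, e.g.:
#   device 15 osd.15 class hdd
#   device 16 device16
#   device 17 osd.17 class hdd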
Yes, but it doesn't help.
After the night it's the same situation:
root@biurowiecH:~# ceph -s
cluster:
id: dcd25aa1-1618-45a9-b902-c081d3fa3479
health: HEALTH_ERR
Reduced data availability: 83 pgs inactive, 3 pgs stale
Degraded data redundancy: 83 pgs unclean...
The disks are spinners with a dedicated journal partition on SSD.
Yes, once, a long time ago. It worked then, so we left it in the configuration.
Maybe I didn't describe my issue properly. The stuck + slow requests hang every VM using RADOS. Their number is only increasing; it seems they won't ever...
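To see where those blocked requests are sitting, something like this helps (a sketch; osd.16 is just an example id, and "ceph daemon" has to be run on the node hosting that OSD):
# Luminous health output names the OSDs with slow/blocked requests
ceph health detail | grep -i slow
# on the affected OSD's node, dump its in-flight operations
ceph daemon osd.16 dump_ops_in_flight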
I've got a 10Gbps network (2 switches: private cluster and public).
Our target was 5 mons, thanks for the tip.
With the 5th mon we are planning to add additional OSDs.
The cluster with this configuration was working for a few months with no problems until today's upgrade.
4 mons because after the upgrade I was planning to add another one. That's also why I have the max_pg_per_osd warning.
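Until the extra OSDs are in, the warning threshold itself can be raised (a sketch; on Luminous 12.2.1+ the option is mon_max_pg_per_osd, older releases use mon_pg_warn_max_per_osd, and 300 is just an example value):
# in the [global] section of ceph.conf (/etc/pve/ceph.conf on Proxmox), then restart the mons
mon max pg per osd = 300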
ceph.conf:
[global]
auth client required = cephx
auth cluster required = cephx
auth service required = cephx
cluster network = 10.6.6.0/24...
Hi,
I've got a problem with my Ceph cluster.
Cluster specification:
4x node
4x mon
4x mgr
37x osd
I started from Ceph Hammer, so I followed these tutorials:
https://pve.proxmox.com/wiki/Ceph_Hammer_to_Jewel - without any problems
https://pve.proxmox.com/wiki/Ceph_Jewel_to_Luminous - without any...
I've got 2 Proxmox clusters: one with a Hammer Ceph cluster (bigger) and a second one working as a Ceph client with a dedicated pool (smaller). I wanted to upgrade the smaller cluster first, before the bigger one with the data. After upgrading two nodes of the smaller cluster to Proxmox 5.1 they are working properly...