Ceph Help: Reduced data availability & Degraded data redundancy

standard_user

New Member
Aug 13, 2023
So I have been trying to get a single-node Proxmox Ceph server up all weekend. I have successfully done a fresh install of Proxmox 8.0.4 and Ceph Quincy, but after setting up Ceph with:

osd_pool_default_min_size = 2
osd_pool_default_size = 3
osd_crush_chooseleaf_type = 0

I get Ceph warnings of:
HEALTH_WARN Reduced data availability: 1 pgs inactive;
Degraded data redundancy: 1 pgs undersized

This is a brand new cluster, with only the .mgr pool. The PG must belong to the .mgr pool, but I don't understand why it's undersized.
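
For reference, the number before the dot in a PG ID is the pool ID, so pg 1.0 should belong to pool 1. Something like this should confirm that pool 1 is the .mgr pool (output format is from memory):

ceph osd pool ls detail    # expect a line like: pool 1 '.mgr' replicated size 3 min_size 2 ...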

ceph pg ls outputs:
PG OBJECTS DEGRADED MISPLACED UNFOUND BYTES OMAP_BYTES* OMAP_KEYS* LOG STATE SINCE VERSION REPORTED UP ACTING SCRUB_STAMP DEEP_SCRUB_STAMP LAST_SCRUB_DURATION SCRUB_SCHEDULING
1.0 0 0 0 0 0 0 0 0 undersized+peered 3m 0'0 25:10 [1]p1 [1]p1 2023-08-13T17:33:37.266872-0500 2023-08-13T17:33:37.266872-0500 0 periodic scrub scheduled @ 2023-08-15T06:15:09.555827+0000

I saw other posts with similar issues that said to try "ceph pg <PG_ID> mark_unfound_lost {revert|delete}". I tried "ceph pg 1.0 mark_unfound_lost {revert|delete}" and "ceph pg 1.0 mark_unfound_lost delete", but I couldn't get the command to execute.

Thanks in advance to anyone who comments!
 
Please show the output of "ceph osd tree" and "ceph -s".
ceph osd tree
ID CLASS WEIGHT TYPE NAME STATUS REWEIGHT PRI-AFF
-1 30.47404 root default
-3 30.47404 host charmander
0 hdd 7.27739 osd.0 up 1.00000 1.00000
1 hdd 7.27739 osd.1 up 1.00000 1.00000
2 hdd 7.27739 osd.2 up 1.00000 1.00000
3 hdd 7.27739 osd.3 up 1.00000 1.00000
4 ssd 0.90970 osd.4 up 1.00000 1.00000
5 ssd 0.45479 osd.5 up 1.00000 1.00000

ceph -s
  cluster:
    id:     c8fd1b0b-8264-4ec6-b372-0b429e99ee66
    health: HEALTH_WARN
            Reduced data availability: 1 pg inactive
            Degraded data redundancy: 1 pg undersized

  services:
    mon: 2 daemons, quorum charmander,squirtle (age 3m)
    mgr: charmander(active, since 3m)
    osd: 6 osds: 6 up (since 36s), 6 in (since 55s)

  data:
    pools:   1 pools, 1 pgs
    objects: 0 objects, 0 B
    usage:   38 MiB used, 30 TiB / 30 TiB avail
    pgs:     100.000% pgs not active
             1 undersized+peered
 
You only have one host.

Ceph wants to place the three copies on different hosts.

This is what undersized means. The placement group has too few OSDs to place the copies on.

You need to add at least two hosts with OSDs.

Ceph works well with five or more nodes.

If you only have one host, use ZFS as storage for Proxmox.
 
Yes, I am trying to do this on one host (more specifically, one OSD). I know that Ceph tries to spread the copies over multiple hosts, but I thought the "osd_crush_chooseleaf_type = 0" option in Ceph removes this requirement for two hosts with OSDs. I know I could run a VM inside my primary Proxmox node with passthrough disks (to get a second OSD), but I think that is a "messy" solution. I have reasons to do my configuration like this (preparing for future expansion, plus an off-site full backup to mitigate the single-OSD point of failure).

Is there any way to have a single OSD within ceph?
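
In case it matters, I assume the way to check what the current rule is actually doing is something like this (standard Ceph CLI; the default rule is usually called replicated_rule, though I may be misreading the dump):

ceph osd pool get .mgr crush_rule    # which CRUSH rule the .mgr pool uses
ceph osd crush rule dump             # the default replicated rule normally has a chooseleaf step with type "host"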
 
According to the info you have 6 OSDs.
To spread the copies on one host you need to edit the CRUSH rule and set the failure domain from host to osd.
You could also just add a new rule with failure domain osd and then change the pool(s) to use that.
In the future you can change it back and Ceph will distribute the copies over multiple hosts.
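
Roughly like this, assuming a replicated pool and the default CRUSH root (the rule name replicated_osd is just an example):

ceph osd crush rule create-replicated replicated_osd default osd    # new rule, failure domain = osd
ceph osd pool set .mgr crush_rule replicated_osd                    # point the pool at the new rule

Later, when you have more hosts, setting crush_rule back to a host-based rule will make Ceph redistribute the copies across hosts again.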
 
My mistake, yes I have 6 OSDs on one host.

I am looking at the Ceph documentation now (https://docs.ceph.com/en/reef/rados/operations/crush-map/). Do you know how to change the CRUSH rule and failure domain?
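
From skimming that page, the manual route looks roughly like this (file names are just placeholders, I haven't actually run it yet):

ceph osd getcrushmap -o crush.bin      # export the compiled CRUSH map
crushtool -d crush.bin -o crush.txt    # decompile it to editable text
# edit crush.txt: in the rule the pool uses, change "step chooseleaf firstn 0 type host" to "type osd"
crushtool -c crush.txt -o crush.new    # recompile
ceph osd setcrushmap -i crush.new      # inject the modified map

But creating a separate rule as you describe sounds simpler.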
 
Hello, could you please share with us what you are trying to accomplish with this setup? In general, clusters consisting of a single node are discouraged.
 