Hello All,
I have allocated two nodes in my PVE cluster for storage. The idea is to use CephFS wherever possible, and possibly set up NFS-Ganesha or Samba file-share exports in containers on the storage nodes for the applications that need those.
I don't particularly trust the disks I have now: all are out of warranty, an eclectic collection ranging from a 10TB 'enterprise' Seagate from 2017 to 3TB WD 'Red' drives from the early 2010s. Since this is a lab setup, I'd like to burn through what I have before I start purchasing replacements. On the other hand, I'd like to avoid losing data; it's a hassle to restore from backup.
I thought that by setting
osd_pool_default_size = 6
with the default CRUSH rule I'd get a 3-way mirror per node, on both storage nodes. Not very efficient or fast, but at least redundant enough. It seems I thought wrong: after creating a pool, I see all of its PGs in the active+undersized state, and ceph pg ls shows that only two OSDs are used for allocation, e.g.:
Code:
PG OBJECTS DEGRADED MISPLACED UNFOUND BYTES OMAP_BYTES* OMAP_KEYS* LOG STATE SINCE VERSION REPORTED UP ACTING SCRUB_STAMP DEEP_SCRUB_STAMP LAST_SCRUB_DURATION SCRUB_SCHEDULING
...
2.0 0 0 0 0 0 0 0 0 active+undersized 7h 0'0 167:42 [17,5]p17 [17,5]p17 2023-02-28T10:10:55.591539-0700 2023-02-28T10:10:55.591539-0700 0 periodic scrub scheduled @ 2023-03-01T21:02:41.807648+0000
...
I pasted my config below. Do I need to modify the crush rule? Or am I missing something else here?
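If it is the rule, the kind of rule I would naively write for "pick both hosts, then three OSDs on each" is something like the following (untested, pieced together from the CRUSH documentation; the rule name and id are placeholders):
Code:
rule replicated_3_per_host {
    id 1
    type replicated
    step take default
    step choose firstn 2 type host
    step chooseleaf firstn 3 type osd
    step emit
}
My reading is that the first step picks both hosts and the second then picks three OSDs under each of them, but I may well be misreading the chooseleaf semantics.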
If I do need to touch the crush map, one approach I'm considering is to add another bucket type, controller, between osd and host in the hierarchy. Is my understanding correct that the integer type identifier is not meaningful, and that if I add e.g.
Code:
type 12 controller
it would not imply that controller becomes the root of the hierarchy? The rough nesting I have in mind is sketched below.

Thanks
leveche
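To make the controller idea concrete, here is roughly the nesting I have in mind for one host (the controller name, the placeholder id, and the exact OSD grouping are hypothetical; I have not tried compiling this):
Code:
# appended to the existing # types list
type 12 controller

# hypothetical: OSDs behind one HBA on fsm1, grouped under a controller bucket
controller fsm1-ctl0 {
    id -100    # placeholder negative id
    alg straw2
    hash 0    # rjenkins1
    item osd.0 weight 5.45799
    item osd.3 weight 3.63869
}

# the host would then contain controller buckets instead of bare OSDs
host fsm1 {
    id -3    # do not change unnecessarily
    alg straw2
    hash 0    # rjenkins1
    item fsm1-ctl0 weight 9.09668
    # ...one item per remaining controller on fsm1...
}
My ceph.conf and the decompiled CRUSH map are below.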
Code:
[global]
auth_client_required = cephx
auth_cluster_required = cephx
auth_service_required = cephx
cluster_network = 169.254.121.12/24
mon_allow_pool_delete = true
mon_host = 169.254.121.12 169.254.121.11 169.254.121.118
ms_bind_ipv4 = true
ms_bind_ipv6 = false
osd_pool_default_min_size = 4
osd_pool_default_size = 6
public_network = 169.254.121.12/24
[client]
keyring = /etc/pve/priv/$cluster.$name.keyring
[mon.fsm1]
public_addr = 169.254.121.11
[mon.fsm2]
public_addr = 169.254.121.118
[mon.fsm3]
public_addr = 169.254.121.12
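(The pool in question was created with the defaults above. I assume I can confirm what it actually picked up with something like the following, where cephfs_data is just a placeholder for the pool name.)
Code:
ceph osd pool get cephfs_data size
ceph osd pool get cephfs_data min_size
ceph osd pool get cephfs_data crush_rule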
Code:
# begin crush map
tunable choose_local_tries 0
tunable choose_local_fallback_tries 0
tunable choose_total_tries 50
tunable chooseleaf_descend_once 1
tunable chooseleaf_vary_r 1
tunable chooseleaf_stable 1
tunable straw_calc_version 1
tunable allowed_bucket_algs 54
# devices
device 0 osd.0 class hdd
device 1 osd.1 class hdd
device 2 osd.2 class hdd
device 3 osd.3 class hdd
device 4 osd.4 class hdd
device 5 osd.5 class hdd
device 6 osd.6 class hdd
device 7 osd.7 class hdd
device 8 osd.8 class hdd
device 9 osd.9 class hdd
device 10 osd.10 class hdd
device 11 osd.11 class hdd
device 12 osd.12 class hdd
device 13 osd.13 class hdd
device 14 osd.14 class hdd
device 15 osd.15 class hdd
device 16 osd.16 class hdd
device 17 osd.17 class hdd
# types
type 0 osd
type 1 host
type 2 chassis
type 3 rack
type 4 row
type 5 pdu
type 6 pod
type 7 room
type 8 datacenter
type 9 zone
type 10 region
type 11 root
# buckets
host fsm1 {
    id -3        # do not change unnecessarily
    id -4 class hdd        # do not change unnecessarily
    # weight 32.74811
    alg straw2
    hash 0    # rjenkins1
    item osd.0 weight 5.45799
    item osd.3 weight 3.63869
    item osd.5 weight 3.63869
    item osd.7 weight 3.63869
    item osd.9 weight 3.63869
    item osd.10 weight 5.45799
    item osd.11 weight 3.63869
    item osd.12 weight 3.63869
}
host fsm3 {
    id -5        # do not change unnecessarily
    id -6 class hdd        # do not change unnecessarily
    # weight 41.71887
    alg straw2
    hash 0    # rjenkins1
    item osd.1 weight 4.54839
    item osd.2 weight 3.51369
    item osd.4 weight 2.72899
    item osd.6 weight 2.72899
    item osd.8 weight 4.54839
    item osd.13 weight 4.54839
    item osd.14 weight 3.63869
    item osd.15 weight 3.63869
    item osd.16 weight 9.09569
    item osd.17 weight 2.72899
}
root default {
    id -1        # do not change unnecessarily
    id -2 class hdd        # do not change unnecessarily
    # weight 74.46698
    alg straw2
    hash 0    # rjenkins1
    item fsm1 weight 32.74811
    item fsm3 weight 41.71887
}
# rules
rule replicated_rule {
    id 0
    type replicated
    step take default
    step chooseleaf firstn 0 type host
    step emit
}
# end crush map
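For completeness, my understanding of the round trip for hand-editing the map is roughly the following (I have not actually pushed an edited map to this cluster yet):
Code:
# dump and decompile the current CRUSH map
ceph osd getcrushmap -o crushmap.bin
crushtool -d crushmap.bin -o crushmap.txt

# ...edit crushmap.txt (types, buckets, rules)...

# recompile, sanity-check the resulting mappings, then inject
crushtool -c crushmap.txt -o crushmap-new.bin
crushtool -i crushmap-new.bin --test --rule 0 --num-rep 6 --show-mappings
ceph osd setcrushmap -i crushmap-new.bin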