ceph crush map

RobFantini

I did not want to hijack this thread: https://forum.proxmox.com/threads/hardware-concept-for-ceph-cluster.32814/
so I am asking here.

I'd like to know how to do this; I'll try to remember to put it on the wiki:
"make sure you have the CRUSH map set to use HOST instead of OSD for the replication <<< TBD
meaning you could lose a full OSD or HOST and still have a replication of 2."

So I have 3 pve/osd nodes with the exact same storage. Pools are sized 3/2.

What are the CLI commands to check that the CRUSH map uses a host-based setting?


thank you in advance.
 
What is your current crush map? AFAIK recent versions of Ceph should set this correctly by default (i.e., your ruleset should end with "step take default", "step chooseleaf firstn 0 type host", "step emit").
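You can also dump the rules directly, without decompiling the whole map; something like this (the rule name might differ on your setup):
Code:
# list all CRUSH rules as JSON; the chooseleaf step should show "type": "host"
ceph osd crush rule dump
# or dump a single rule by name
ceph osd crush rule dump replicated_ruleset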
 
Copy and paste from the GUI came out as one long line, so here is the decompiled map from the CLI:

Code:
# grab the compiled CRUSH map from the cluster
ceph osd getcrushmap -o fo
# decompile it into readable text
crushtool -d fo -o crushmap.txt
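(For reference, putting an edited map back into the cluster would be the reverse, something like this, assuming the edited file is still called crushmap.txt:)

Code:
# recompile the edited text map into binary form
crushtool -c crushmap.txt -o crushmap.new
# inject it back into the cluster (placement changes trigger rebalancing)
ceph osd setcrushmap -i crushmap.new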

Code:
# begin crush map
tunable choose_local_tries 0
tunable choose_local_fallback_tries 0
tunable choose_total_tries 50
tunable chooseleaf_descend_once 1
tunable chooseleaf_vary_r 1
tunable straw_calc_version 1

# devices
device 0 osd.0
device 1 osd.1
device 2 osd.2
device 3 osd.3
device 4 osd.4
device 5 osd.5
device 6 device6
device 7 osd.7
device 8 osd.8
device 9 osd.9
device 10 osd.10
device 11 osd.11
device 12 osd.12
device 13 osd.13
device 14 osd.14
device 15 osd.15
device 16 osd.16
device 17 osd.17
device 18 osd.18

# types
type 0 osd
type 1 host
type 2 chassis
type 3 rack
type 4 row
type 5 pdu
type 6 pod
type 7 room
type 8 datacenter
type 9 region
type 10 root

# buckets
host sys12 {
        id -2           # do not change unnecessarily
        # weight 0.000
        alg straw
        hash 0  # rjenkins1
}
host sys20 {
        id -3           # do not change unnecessarily
        # weight 0.000
        alg straw
        hash 0  # rjenkins1
}
host sys24 {
        id -4           # do not change unnecessarily
        # weight 0.000
        alg straw
        hash 0  # rjenkins1
}
host sys4 {
        id -5           # do not change unnecessarily
        # weight 0.000
        alg straw
        hash 0  # rjenkins1
}
host sys6 {
        id -6           # do not change unnecessarily
        # weight 0.000
        alg straw
        hash 0  # rjenkins1
}
host sys13 {
        id -7           # do not change unnecessarily
        # weight 0.000
        alg straw
        hash 0  # rjenkins1
}
host sys10 {
        id -8           # do not change unnecessarily
        # weight 2.590
        alg straw
        hash 0  # rjenkins1
        item osd.10 weight 0.432
        item osd.11 weight 0.432
        item osd.12 weight 0.432
        item osd.18 weight 0.432
        item osd.14 weight 0.432
        item osd.13 weight 0.432
}
host sys5 {
        id -9           # do not change unnecessarily
        # weight 2.590
        alg straw
        hash 0  # rjenkins1
        item osd.15 weight 0.432
        item osd.16 weight 0.432
        item osd.17 weight 0.432
        item osd.0 weight 0.432
        item osd.1 weight 0.432
        item osd.2 weight 0.432
}
host sys8 {
        id -10          # do not change unnecessarily
        # weight 2.590
        alg straw
        hash 0  # rjenkins1
        item osd.7 weight 0.432
        item osd.8 weight 0.432
        item osd.9 weight 0.432
        item osd.5 weight 0.432
        item osd.3 weight 0.432
        item osd.4 weight 0.432
}
root default {
        id -1           # do not change unnecessarily
        # weight 7.769
        alg straw
        hash 0  # rjenkins1
        item sys12 weight 0.000
        item sys20 weight 0.000
        item sys24 weight 0.000
        item sys4 weight 0.000
        item sys6 weight 0.000
        item sys13 weight 0.000
        item sys10 weight 2.590
        item sys5 weight 2.590
        item sys8 weight 2.590
}

# rules
rule replicated_ruleset {
        ruleset 0
        type replicated
        min_size 1
        max_size 10
        step take default
        step chooseleaf firstn 0 type host
        step emit
}

# end crush map
 
Hello Fabian

when you get a chance, can you check our crush map to make sure that we are
set to use HOST instead of OSD for the replication?

thanks.
 
yes, like I said:
Code:
       step take default
       step chooseleaf firstn 0 type host
       step emit

means
Code:
step take default => start at the default root of the hierarchy
step chooseleaf firstn 0 type host => is a shortcut for
     step choose firstn 0 type host => choose as many hosts as you need
     step choose firstn 1 type osd => for each host, pick a single osd
step emit => done
 
your crush map balances over hosts, not over osds - so you have what you want.
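
If you want to double-check, crushtool can also simulate placements against the binary map; a sketch assuming the file from your earlier post is still called fo:
Code:
# simulate rule 0 with 3 replicas; each printed mapping should only contain
# OSDs that belong to three different hosts
crushtool -i fo --test --rule 0 --num-rep 3 --show-mappings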
 
I have an issue. I created rules in the crush map; the rules are designed for ssd, nvme and hdd, but now I need to move the disks, for example the ssd partitions (osd.4), to my rule replicated_ssd.

What can I do? I think I need some step to configure the rules with osd.4, but I'm not sure what command I need to use.


Code:
# begin crush map
tunable choose_local_tries 0
tunable choose_local_fallback_tries 0
tunable choose_total_tries 50
tunable chooseleaf_descend_once 1
tunable chooseleaf_vary_r 1
tunable chooseleaf_stable 1
tunable straw_calc_version 1
tunable allowed_bucket_algs 54

# devices
device 0 osd.0 class hdd
device 1 osd.1 class ssd
device 2 osd.2 class hdd
device 3 osd.3 class ssd
device 4 osd.4 class ssd
device 5 osd.5 class ssd
device 6 osd.6 class hdd
device 7 osd.7 class hdd
device 8 osd.8 class ssd
device 9 osd.9 class ssd
device 10 osd.10 class ssd
device 11 osd.11 class hdd
device 12 osd.12 class ssd
device 13 osd.13 class ssd
device 14 osd.14 class hdd
device 15 osd.15 class hdd
device 19 osd.19 class hdd
device 20 osd.20 class hdd
device 21 osd.21 class hdd
device 26 osd.26 class nvme
device 27 osd.27 class nvme

# types
type 0 osd
type 1 host
type 2 chassis
type 3 rack
type 4 row
type 5 pdu
type 6 pod
type 7 room
type 8 datacenter
type 9 region
type 10 root

# buckets
host R1-12-15 {
        id -3           # do not change unnecessarily
        id -5 class hdd         # do not change unnecessarily
        id -2 class ssd         # do not change unnecessarily
        id -22 class nvme       # do not change unnecessarily
        # weight 14.632
        alg straw2
        hash 0  # rjenkins1
        item osd.2 weight 5.451
        item osd.5 weight 1.865
        item osd.10 weight 1.865
        item osd.11 weight 5.451
}
host R1-8-11 {
        id -7           # do not change unnecessarily
        id -9 class hdd         # do not change unnecessarily
        id -4 class ssd         # do not change unnecessarily
        id -23 class nvme       # do not change unnecessarily
        # weight 9.176
        alg straw2
        hash 0  # rjenkins1
        item osd.6 weight 2.723
        item osd.7 weight 2.723
        item osd.8 weight 1.865
        item osd.9 weight 1.865
}
host R2-9-12 {
        id -10          # do not change unnecessarily
        id -12 class hdd        # do not change unnecessarily
        id -8 class ssd         # do not change unnecessarily
        id -24 class nvme       # do not change unnecessarily
        # weight 0.000
        alg straw2
        hash 0  # rjenkins1
}
host R2-4-8 {
        id -13          # do not change unnecessarily
        id -14 class hdd        # do not change unnecessarily
        id -15 class ssd        # do not change unnecessarily
        id -25 class nvme       # do not change unnecessarily
        # weight 7.369
        alg straw2
        hash 0  # rjenkins1
        item osd.12 weight 1.865
        item osd.13 weight 1.865
        item osd.14 weight 2.729
        item osd.15 weight 0.910
}
host R-29-12 {
        id -16          # do not change unnecessarily
        id -17 class hdd        # do not change unnecessarily
        id -18 class ssd        # do not change unnecessarily
        id -26 class nvme       # do not change unnecessarily
        # weight 14.632
        alg straw2
        hash 0  # rjenkins1
        item osd.0 weight 5.451
        item osd.1 weight 1.865
        item osd.3 weight 1.865
        item osd.19 weight 5.451
}
host R2-14-17 {
        id -19          # do not change unnecessarily
        id -20 class hdd        # do not change unnecessarily
        id -21 class ssd        # do not change unnecessarily
        id -27 class nvme       # do not change unnecessarily
        # weight 5.490
        alg straw2
        hash 0  # rjenkins1
        item osd.26 weight 0.931
        item osd.27 weight 0.931
        item osd.20 weight 0.904
        item osd.21 weight 0.910
        item osd.4 weight 1.814
}
root default {
        id -1           # do not change unnecessarily
        id -6 class hdd         # do not change unnecessarily
        id -11 class ssd        # do not change unnecessarily
        id -28 class nvme       # do not change unnecessarily
        # weight 51.297
        alg straw2
        hash 0  # rjenkins1
        item R1-12-15 weight 14.632
        item R1-8-11 weight 9.175
        item R2-9-12 weight 0.000
        item R2-4-8 weight 7.368
        item R-29-12 weight 14.632
        item R2-14-17 weight 5.490
}

# rules
rule replicated_rule {
        id 0
        type replicated
        min_size 1
        max_size 10
        step take default
        step chooseleaf firstn 0 type host
        step emit
}
rule replicated_ssd {
        id 1
        type replicated
        min_size 1
        max_size 10
        step take default class ssd
        step chooseleaf firstn 0 type host
        step emit
}
rule replicated_nvme {
        id 2
        type replicated
        min_size 1
        max_size 10
        step take default class nvme
        step chooseleaf firstn 0 type host
        step emit
}
rule replicated_hdd {
        id 3
        type replicated
        min_size 1
        max_size 10
        step take default class hdd
        step chooseleaf firstn 0 type host
        step emit
}

# end crush map
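
I found these commands for changing the device class of an OSD, but I am not sure if that is the right approach here (osd.4 already shows class ssd in my map, so this may not even be needed):
Code:
# list the known device classes
ceph osd crush class ls
# clear and re-set the class on a single OSD
ceph osd crush rm-device-class osd.4
ceph osd crush set-device-class ssd osd.4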
 
@Miguel Moreira, I don't exactly understand what you mean, but your SSDs are already targeted by the rule 'replicated_ssd', since their device class is ssd. To use this rule you need to either create a new pool with it or set it as the rule for an existing pool (which starts an immediate rebalance).
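
For example (pool name and PG count are just placeholders, adjust them to your cluster):
Code:
# option 1: create a new pool that uses the ssd rule from the start
ceph osd pool create ceph-ssd 128 128 replicated replicated_ssd

# option 2: switch an existing pool to the ssd rule (starts an immediate rebalance)
ceph osd pool set <existing-pool> crush_rule replicated_ssd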
 
My question is: I created Ceph and the VM storage before creating the crush map rules. My VMs are living in the pool that uses the default rule, which includes all disks, but now I want to move some of them to the new rules; my question is about the process.
If I understand correctly, I now need to create a new pool and then take each VM and move its disk to the new pool. Is that right?


Thanks for your help.
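
Something like this is what I have in mind (storage ID, pool name, VM ID and disk name are just placeholders, I am not sure about the exact commands):
Code:
# add the new Ceph pool as an RBD storage in Proxmox
pvesm add rbd ceph-ssd --pool ceph-ssd --content images
# then move one VM disk at a time to the new storage
# (same as Hardware -> Move disk in the GUI)
qm move_disk 100 scsi0 ceph-ssd --delete 1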
 
