[SOLVED] CEPH Crush Map Help!

emilhozan · Oct 6, 2019

Hey all,

Just looking for some guidance. I read quite a few posts related to this and even the ceph docs. I think for the most part my crush map looks like I want it but there is a lot of extra info I can't figure out how to remove.

At a high level I want:
#buckets
1x Pool for SSDs
1x Pool for HDDs

#rules
1x rule to manage SSDs in the SSD bucket
1x rule to manage HDDs in the HDD bucket

If possible, can you verify the attached "desired_crushmap.txt" file to confirm that my understanding of how this is laid out is actually valid?
Also, I'd like to clean up my currently sloppy-looking crushmap, attached as "current_crushmap.txt".

Following the crush map docs and whatnot, the ideal crush map I'd like to see is the attached "desired_crushmap.txt"

However, when trying to push this crush map to the PVE cluster, I get error messages:
"ceph error einval failed to parse crushmap buffer:malformed_input bad magic number"

I tried looking it up, no dice. The command I'm using to push this is:
ceph osd setcrushmap -i <filename>

I'm assuming it may be something to do with the IDs but I am not sure.
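For readers hitting the same "bad magic number" message: it most likely means a text crush map was passed straight to `ceph osd setcrushmap -i`, which expects a compiled (binary) map. Below is a minimal sketch of the export, edit, and import round trip. The file names are placeholders, and with `DRY_RUN=1` (the default here) the commands are only printed, not executed.

```shell
#!/bin/sh
# Sketch of the CRUSH map edit round trip. File names are placeholders.
# With DRY_RUN=1 (the default) each command is printed instead of executed.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

export_map() {
    run ceph osd getcrushmap -o crushmap.bin           # export the compiled map
    run crushtool -d crushmap.bin -o crushmap.txt      # decompile to editable text
}

import_map() {
    run crushtool -c crushmap.txt -o crushmap.new.bin  # compile text back to binary
    run ceph osd setcrushmap -i crushmap.new.bin       # inject the compiled map
}

export_map
# ... edit crushmap.txt by hand, then:
import_map
```

Set `DRY_RUN=0` only on a cluster where replacing the crush map is acceptable, such as a test cluster.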

So then I tried to manually move things around and saw this command:
ceph osd crush move osd.[0-19] root=BucketHDDs (OSDs 0-19 are HDDs)
ceph osd crush move osd.[20-24] root=BucketSSDs (OSDs 20-24 are SSDs)
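The `osd.[0-19]` notation above is shorthand for a range of OSD IDs, not something the shell expands into one command per OSD. A minimal sketch of the loop it stands for follows. It assumes, as the post reports, that `ceph osd crush move` accepts OSD names; the helper `move_osds` is hypothetical, and with `DRY_RUN=1` (the default) it only prints the commands.

```shell
#!/bin/sh
# Sketch: expand the osd.[0-19] / osd.[20-24] shorthand into one command per OSD.
# Bucket names and ID ranges come from the post; move_osds is a made-up helper.
DRY_RUN="${DRY_RUN:-1}"

move_osds() {
    # $1 = first OSD id, $2 = last OSD id, $3 = target root bucket
    i="$1"
    while [ "$i" -le "$2" ]; do
        if [ "$DRY_RUN" = "1" ]; then
            echo "ceph osd crush move osd.$i root=$3"
        else
            ceph osd crush move "osd.$i" "root=$3"
        fi
        i=$((i + 1))
    done
}

move_osds 0 19 BucketHDDs   # HDD-backed OSDs
move_osds 20 24 BucketSSDs  # SSD-backed OSDs
```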

That was fine but now I am facing an issue with the rules to manage this, plus the crush map currently configured looks sloppy. Is there a way to clean this up?

My current crush map is attached as "current_crushmap.txt"

If I am missing any info, please let me know. My main goal is to have separate pools for SSDs and HDDs

emilhozan · Oct 6, 2019

Hmmm, okay, I tried messing with the ID references and got a bit further. I updated my desired crush map as attached. Can you confirm it will achieve my goal?

I saw the commands:
crushtool -c desired-crushmap-updated.txt -o desired-crushmap-updated.bin
ceph osd setcrushmap -i desired-crushmap-updated.bin

And I am now getting:
Error EINVAL: pool 4 size 1 does not fall within rule 0 min_size 2 and max_size 3

Here is a screenshot:

I feel I'm on the right track, can you please help me get over my hurdle?
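Read literally, the error says that pool 4 keeps 1 replica while rule 0 only accepts pools with 2 to 3 replicas. The sketch below reproduces that range check with a hypothetical helper, `size_fits_rule`, using the numbers from the message. The `ceph` commands in the trailing comments are for a live cluster and are not executed here; the pool name is a placeholder.

```shell
#!/bin/sh
# Sketch of the range check behind the EINVAL message.
# Succeeds when a pool's replica count lies within the rule's min_size..max_size.
size_fits_rule() {
    # $1 = pool size, $2 = rule min_size, $3 = rule max_size
    [ "$1" -ge "$2" ] && [ "$1" -le "$3" ]
}

# Values from the error: pool 4 has size 1, rule 0 allows 2..3.
if size_fits_rule 1 2 3; then
    echo "pool size fits the rule"
else
    echo "pool size 1 is outside 2..3"
fi

# On a live cluster (not run here), pool ids and sizes can be listed with:
#   ceph osd pool ls detail
# and a pool's replica count changed with (pool name is a placeholder):
#   ceph osd pool set <pool-name> size 2
```

Either the pool's size can be raised into the rule's range, or the rule's `min_size` in the text map can be lowered to 1 so that it covers the pool.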

emilhozan · Oct 6, 2019

Okay, I'll just keep posting just in case someone may need some more guided help here.

For starters, I took a step back and a break, then came back and made some massive progress.

Anyways, ended up reading this specifically on removing buckets and rules
Had to mess a bit with my current storage pools and removed pretty much everything I could find (this is a test cluster, FYI). This included my Datacenter -> Storage Pools (except the local and local-lvm pools)
Then on a node, I removed Ceph -> CephFS (I have a shared pool here for ISOs to be available on any host)
I did have to remove the MDS during this (used this wiki link)
Moved to Ceph -> Pools, removed all here except the local and local-lvm
Got my crush map fairly cleaned up but not quite to where I want it.

Currently my latest crush map is attached as "updated_cleanedup.txt" and I am trying to get rid of the default bucket, as well as the two extraneous rules calling that bucket (CephRule-[SSD|HDD])

I feel like I'm almost there, just gotta verify if my bucket syntax is correct.

For example:

rule Rule-SSD {
id 4
type replicated
min_size 1
max_size 10
step take BucketSSDs class ssd
step chooseleaf firstn 0 type host # *** read below
step emit
}

*** Does this line need to be changed for the "type host" parameter? I removed my host declarations (the nodes that each held 4x HDDs and 1x SSD), which is why all OSDs were auto-assigned. I'm thinking I need to make a new type, define it in the "#types" section, and then change each rule to that same type?
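One way to check a rule like this before pushing it is to compile the text map and let `crushtool` simulate placements offline. This is a minimal sketch: `updated_cleanedup.txt` is the attachment name from this post, while the `.bin` name, the replica count of 3, and the helper `test_rule` are illustrative assumptions. With `DRY_RUN=1` (the default) the commands are only printed.

```shell
#!/bin/sh
# Sketch: compile a text CRUSH map and test one rule offline with crushtool.
# Rule id 4 is Rule-SSD from the post; the replica count 3 is illustrative.
DRY_RUN="${DRY_RUN:-1}"

test_rule() {
    # $1 = text map, $2 = rule id, $3 = number of replicas
    bin="${1%.txt}.bin"
    if [ "$DRY_RUN" = "1" ]; then
        echo "crushtool -c $1 -o $bin"
        echo "crushtool -i $bin --test --rule $2 --num-rep $3 --show-bad-mappings"
    else
        crushtool -c "$1" -o "$bin" &&
            crushtool -i "$bin" --test --rule "$2" --num-rep "$3" --show-bad-mappings
    fi
}

test_rule updated_cleanedup.txt 4 3
```

If `--show-bad-mappings` prints any lines, the rule could not place the requested number of replicas for those inputs, which is what one would expect when `type host` is used in a map that has no host buckets.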

Thanks for helping out. I understand this is more of a Ceph thing at this point, but since PVE only allows one pool (from what I read somewhere; anything more requires the CLI, which I don't mind), I just wanted some help getting this fine-tuned.

Alwin · Oct 7, 2019

IIUC, then you want device class based pools. If so, see the link below.
https://pve.proxmox.com/pve-docs/chapter-pveceph.html#pve_ceph_device_classes

As you do not have any hosts in your crushmap, the 'type host' in the rule will probably not work. And in the end the data will be distributed across all OSDs, regardless of other failure domains (e.g. hosts). This means the copies Ceph makes (according to size/min_size) can be placed on different OSDs on the same node. A failure of any such node can then result in data loss.
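To see whether host buckets and device classes are actually present, a few read-only commands are usually enough. This is a minimal sketch; the helper `inspect_crush` is hypothetical, and with `DRY_RUN=1` (the default) the commands are printed rather than sent to a cluster.

```shell
#!/bin/sh
# Sketch: read-only commands for checking device classes and the CRUSH hierarchy.
# DRY_RUN=1 (the default) prints them instead of contacting a cluster.
DRY_RUN="${DRY_RUN:-1}"

inspect_crush() {
    for cmd in \
        "ceph osd crush class ls" \
        "ceph osd crush tree --show-shadow" \
        "ceph osd crush rule ls"
    do
        if [ "$DRY_RUN" = "1" ]; then
            echo "$cmd"
        else
            $cmd   # intentionally unquoted so the string splits into words
        fi
    done
}

inspect_crush
```

The tree output shows whether OSDs still sit under host buckets, and the class listing shows which device classes (such as `hdd` and `ssd`) the cluster has detected.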

emilhozan · Oct 9, 2019

Man, looking back now, I was way overcomplicating what was needed to get done. It was fun learning more about ceph and the crush map but the way the OSDs were assigned, it was a simple act of creating two rules to manage each device type. Once I figured out the terminology, things started clicking!

Thanks again.

To be clear and for future readers, to have two pools (1x for SSD and 1x for HDD):
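The rule listing itself is not reproduced in this copy of the post. As a stand-in, here is a minimal sketch of two device-class rules using the `create-replicated` syntax from the Proxmox documentation linked earlier in the thread. The rule names, pool names, root `default`, and failure domain `host` are illustrative assumptions, not necessarily the values the author used, and the pools are assumed to exist already. With `DRY_RUN=1` (the default) the commands are only printed.

```shell
#!/bin/sh
# Sketch: one replicated rule per device class, then point a pool at each rule.
# Rule names, pool names, root "default" and failure domain "host" are
# illustrative assumptions, not the author's confirmed values.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

make_class_rule() {
    # $1 = rule name, $2 = device class (hdd or ssd), $3 = existing pool to assign
    run ceph osd crush rule create-replicated "$1" default host "$2"
    run ceph osd pool set "$3" crush_rule "$1"
}

make_class_rule rule-hdd hdd pool-hdd
make_class_rule rule-ssd ssd pool-ssd
```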

Those are the only two rules you need. If you need something more elaborate, there are workarounds. The main issue I found was that even if I manually edited the bucket declarations, they'd revert after rebooting. I read about an option that didn't make sense at the time but may have been the answer; I couldn't find it again when searching the other day. But I figured it out with the above rules and the default buckets and device types.

Thanks for your support, PVE Support Staff!

