Issues creating CEPH EC pools using pveceph command

Superfish1000

Active Member
Oct 28, 2019
I wanted to start a short thread here because I believe I may have found either a bug in the pveceph command or a mistake in the Proxmox documentation for it, or maybe I'm just misunderstanding something, and wanted to put it out there. Either way, I think it may help others.

I was going through the CEPH setup for a new server that I wanted to configure with an erasure coded pool using a k=2, m=1 config and the crush-failure-domain set to the OSD level. I haven't done this in a while and forgot most of how I did it last time. I did remember that a few commands were needed to establish an EC pool and a replicated metadata pool, and that they then needed to be added as storage using another command that specified this.

I initially located an online thread about using the erasure-code-profile option in the native CEPH tools to establish a profile, then applying it with the native CEPH create commands. Having only a vague memory, I opted to try this, and it all seemed to be going smoothly. Unfortunately, I ran into some other issues there with the pool saying it didn't support RBD, and I ultimately decided to skim the docs again and blow up the pool.
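For reference, the "doesn't support RBD" complaint comes from the fact that RBD can't keep its image metadata (omap) on an EC pool, so the native approach needs a replicated pool alongside the EC one. What I was half-remembering looks roughly like this; the pool and image names are placeholders and the commands are from memory, so treat it as a sketch rather than exactly what I ran:
Code:
# enable partial overwrites on the EC pool so RBD data objects can live there
ceph osd pool set <ec-pool> allow_ec_overwrites true
# replicated pool to hold the RBD metadata/omap
ceph osd pool create <metadata-pool> replicated
rbd pool init <metadata-pool>
# create the image in the replicated pool, with its data objects on the EC pool
rbd create <image-name> --size 10G --pool <metadata-pool> --data-pool <ec-pool>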

I had previously seen the command,
Code:
pveceph pool create <pool-name> --erasure-coding profile=<profile-name>
but was unable to get it to accept the profile I had created, so I instead just went with the working CEPH native commands (this was the pool I just had to blow up). Below are the commands I attempted to run before, and again just now, and the error that I got.

Code:
ceph osd erasure-code-profile set raid5-profile k=2 m=1 crush-failure-domain=osd
pveceph pool create MainCEPH_EC5-data --erasure-coding profile=raid5-profile

[Screenshot of the error output]

I ultimately just reworked the command to use pveceph directly instead of trying to use a pre-made profile, and ended up with the following.
Code:
pveceph pool create MainCEPH_EC5 --pg_num 32 --erasure-coding k=2,m=1,failure-domain=osd
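To sanity check it afterwards, these should show the generated profile and the resulting pools:
Code:
pveceph pool ls
ceph osd erasure-code-profile ls
ceph osd pool ls detail
If the pool doesn't get attached as storage automatically, I believe it can also be added manually with pvesm, pointing an RBD storage at the replicated metadata pool with the EC pool as its data pool. The pool names and flags here are from memory, so double-check them against the pvesm man page and your actual pool list:
Code:
pvesm add rbd MainCEPH_EC5 --pool MainCEPH_EC5-metadata --data-pool MainCEPH_EC5-data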

If anyone can explain what I did wrong or let me know if this was actually a bug then I'd appreciate it.

pve-manager/8.3.0/c1689ccb1065a83b (running kernel: 6.8.12-4-pve)
ceph version 19.2.0 (3815e3391b18c593539df6fa952c9f45c37ee4d0) squid (stable)
 
If anyone can explain what I did wrong
No idea, sorry.

But: let pveceph create that profile! Just tell it what you want. One of my (technically) successful attempts included:
Code:
~# pveceph pool create ec22 --erasure-coding k=2,m=2                       
created new erasure code profile 'pve_ec_ec22'
pool ec22-data: ...
Without preparing any profile beforehand.
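If you want to see exactly what it set up, the auto-created profile and pools can be inspected afterwards (using the profile name from the output above):
Code:
~# ceph osd erasure-code-profile get pve_ec_ec22
~# pveceph pool ls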
 
Why do you want to create a Ceph pool (or even use Ceph at all) in a single Proxmox server?
I had a few reasons in my case. Primarily, I wanted the ability to expand my overall storage pool more easily than I could with the other methods I was considering, simply by swapping drives out for larger ones. Right now the system I am working with has 4 x 1TB drives, but I would like the option to upgrade that to 4 x 8TB drives in the future.

In the past I have used NAS software built on BTRFS, and I greatly appreciated the flexibility to add to or remove from the storage pool, or even change the RAID level, on the fly. Unfortunately I have run into no end of issues with BTRFS in general and don't want to repeat them. I have used CEPH a bit and it seems like it will be more workable in this respect while having similar behavior, and this is a low-stakes situation, so I figured I'd try to hack it together.

My second reason is that I have a 4-node cluster with CEPH installed where it is supposed to be communicating over a 40Gb IPoIB connection, and the performance is, frankly, abysmal. It's definitely not an optimal use case, but I'm having an extremely hard time identifying a legitimate reason why it's so incredibly bad. I have wasted many days trying to troubleshoot and test it, and I also wanted to use this single-node setup as a test case to learn more about CEPH and try to figure out why that pool is so bad.
Even in this severely handicapped use case I'm able to achieve a stable 150 MB/s write speed to the pool with four old, cheap, used 1TB HDDs, while my pool with significantly better HDDs consistently achieves less than 50 MB/s.
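For anyone who wants to run the same comparison on their own pools, a straightforward way to get raw write and read throughput is rados bench against the data pool. The pool name here is just the one from my example above, so adjust as needed:
Code:
rados bench -p MainCEPH_EC5-data 60 write --no-cleanup
rados bench -p MainCEPH_EC5-data 60 seq
rados -p MainCEPH_EC5-data cleanup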

While I think it goes without saying, I'll note that I fully understand none of what I am doing is advisable, best practice, or even particularly sane, but I like experimenting, and the setup where I ran into this behavior interested me.

My highly suspect mad scientist experiment aside, I primarily wanted to know if I had made a mistake on the command itself or if this was indeed a bug in the pveceph command since I think this could help others either way.
 
