Install Ceph on Dell PowerEdge R720 with PERC

tru64guru

New Member
Sep 10, 2021
I am new to Proxmox. I have 5x Dell PE R720 servers, each with 10 internal disks managed by a PERC controller.

I created a RAID1 and installed the Proxmox OS on it. I have the cluster created now.

My question is: how do I make the other 8 disks on each host visible to Ceph? Do I create a RAID0 out of each disk?

Appreciate any feedback.

Thanks
 
Single-disk RAID 0 volumes are not optimal. The best way is to set the RAID controller to HBA mode and create a software RAID for the OS.
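If the controller and firmware support it, the personality can usually be switched with perccli; the exact commands below are only an assumption and vary by controller generation (the H710 in the R720 in particular may not offer a true HBA mode), so treat this as a rough sketch:

Code:
# Assumption: perccli (Dell's storcli rebrand) is installed and controller 0 supports an HBA personality
perccli /c0 show                     # check the current controller mode
perccli /c0 set personality=HBA      # switch to HBA/pass-through (a reboot is usually required)
lsblk -o NAME,SIZE,MODEL,SERIAL      # afterwards the data disks should appear as plain block devices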
 
Hello, we also have a Dell PE R720 with a PERC H710 RAID controller. We are not using Ceph; in our case we use LVM. In the controller menu (Ctrl+R) we created three RAID1 volumes: a 1 TB SSD for Proxmox, a 2 TB HDD for DATA, and a 256 GB SSD for CACHE and SWAP. To see the DATA, CACHE, and SWAP disks it is necessary to create the LVM volumes and add them to the storage. We have it like this:
[screenshot of our storage layout]

In your case, as you don't want RAID1, you can use the suggested option with HBA mode and create the volume you want with LVM-Thin. It can be something like this:

echo -e "w\ny\n" | gdisk /dev/sdb
echo -e "w\ny\n" | gdisk /dev/sdc
...
echo -e "w\ny\n" | gdisk /dev/sXY
wipefs -a /dev/sdb
wipefs -a /dev/sdc
...
wipefs -a /dev/sXY
pvcreate /dev/sdb /dev/sdc ... /dev/sXY
vgcreate data /dev/sdb /dev/sdc ... /dev/sXY
lvcreate -L VOLUMEN_SIZET -T -c 64k data/data data



and then add the thin pool to the Proxmox storage configuration:
[screenshot of the storage configuration]
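A hedged CLI equivalent of that step (assuming the volume group and the thin pool are both named data, as created above):

Code:
pvesm add lvmthin data --vgname data --thinpool data --content images,rootdir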

This is our configuration. Maybe it is a good alternative to Ceph, we don't know; I hope it helps you.
 
Without Ceph or ZFS, I recommend hardware RAID. It makes disk replacements easier to handle.
 
Thanks for all the replies. The goal is actually to use the Proxmox cluster to build VMs running Kubernetes. My understanding is that we need a unified filesystem like Ceph configured (equivalent to vSAN in VMware). Is there an alternative that achieves the same result?

Thanks again for the help.
 
Hello,

besides vSphere, you can also use Hyper-V with S2D. Personally, I find Proxmox with Ceph more appealing as open source.
 
@SkyDiver79 we are going to use Proxmox. I meant to ask whether I can run Proxmox with Ceph on underlying disks created by RAID. If I can't use Ceph with the PERC in RAID mode, what are the other Proxmox unified storage options for our use case?
 
All HCI storage virtualizations like VMware vSAN, Microsoft S2D, and Ceph need the disks natively, without a RAID controller.

If you want to use HCI with hardware RAID, you have to use a paid solution like StorMagic or DataCore.

If replication is enough for you, you can also use your hardware RAID with ZFS.
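A rough sketch of what that replication looks like on the Proxmox side (assuming ZFS-backed storage on both nodes, a VM with ID 100, and a hypothetical target node named pve2):

Code:
# Replicate VM 100 to node pve2 every 15 minutes (job id 100-0)
pvesr create-local-job 100-0 pve2 --schedule '*/15'
pvesr status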
 
While there are a few alternatives to Ceph, namely GlusterFS, none of them is designed for use on RAID'ed local disks. Distributed file systems are built with distributed data protection in mind. They replicate data for recovery from component failure, which means blocks are replicated more than once in many cases.
Your local RAID will be replicating blocks locally (except R0, of course) and then Ceph/GlusterFS will be doing it again on top of that. It doesn't sound like your disks are SSDs, so performance will be pretty bad.

If you can't change your controller to HBA-only mode, where disks are exposed to the OS directly, perhaps 8 single-disk R0 volumes would do the trick, if it even allows you to create a single-disk R0. I foresee trouble doing disk replacements: how will the PERC react to a single-disk R0 failure? Will you have to take the system down to reconfigure everything? In short, it's not recommended...
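If you do get the disks exposed directly, creating the OSDs on each node is roughly this (a sketch, assuming /dev/sdb is one of the pass-through data disks and Ceph has already been installed and initialized via pveceph):

Code:
# Wipe leftover signatures, then create a Bluestore OSD on the raw disk
ceph-volume lvm zap /dev/sdb --destroy
pveceph osd create /dev/sdb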

As @SkyDiver79 mentioned, the other alternatives are commercial products. StorMagic might be an option; it appears to run inside a VM on top of your hypervisor. Personally, I think you want the storage layer to be as close to the raw hardware as possible. DataCore only runs on Windows; based on what you said, I don't think that's an option.

For us at Blockbridge, we'd want to run on a dedicated server platform: https://www.blockbridge.com/platforms/. Isolating storage from compute prevents the two from competing for resources.


Ultra low latency all-NVME shared storage for Proxmox - https://www.blockbridge.com/proxmox
 
Just FYI: StorMagic does have a solution with Proxmox VE that enables you to build a 2-node HCI cluster (with a lightweight witness): https://support.stormagic.com/hc/en-gb/sections/14959498948381-SvSAN-Proxmox-7
 
For those considering using consumer-grade disks and hoping the PERC H730 RAID buffer will save them: don't.

Here is our story:

We built our Ceph cluster on top of about 40 Samsung 860 QVO drives (we had them already and were not aware of the PLP, power-loss protection, feature).
Initially we went with JBOD and got awful performance due to huge sync times.
We then made use of the PERC H730 RAID card, exposing every SSD as a single-drive RAID0 disk. The idea was that the 1 GB BBU-backed cache would smooth over write latency.

The resulting sync time was acceptable in the beginning, but it was a leaky abstraction:

The RAID0+BBU hack we did essentially placed a big shared write cache in front of a bunch of SSD drives, improving write latency (in the absence of heavy IO). The cache behaves like a FIFO; write requests that enter it cannot be reordered.

Ceph has built-in logic to give higher priority to client write requests (over recovery IO).

Normally, time-sensitive writes have the "barrier" flag set. Ceph recognizes such requests and places them at the front of its "software" write queue.

In the case of our RAID0+BBU hack, Ceph is not aware of the FIFO queue we set up, nor does it have a way to reorder writes in it. Thus rebuild IO stuffs up the FIFO queue, causing time-sensitive writes to patiently wait for their turn.
Ceph rebuilding involves spreading blocks across multiple drives at once. The shared nature of the FIFO (across all SSD drives in the machine) causes rebuild traffic to quickly fill up the write buffer.

To mitigate the issue, we reduced Ceph rebuilding to the minimum speed possible (about 50 MB/s):

Code:
ceph tell osd.* injectargs --osd_max_backfills=1
ceph tell osd.* injectargs --osd_recovery_max_active=1
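Note that injectargs only changes the running daemons; if the limits should survive OSD restarts, the same values can also be stored in the cluster configuration database (a sketch, assuming a Ceph release with the centralized config store):

Code:
ceph config set osd osd_max_backfills 1
ceph config set osd osd_recovery_max_active 1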
 
The project was doomed to fail right from the start.
The QVO drives are quad-level cell (QLC) SSDs, which have worse sustained write performance than HDDs. The small 1 GB cache can't compensate for that either.

The whole thing could have been reasonably usable if fast NVMe devices had been added as Bluestore DB devices.
That would have allowed the metadata to be written and retrieved quickly. However, the user data would still have been slow.
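For illustration, on Proxmox placing the Bluestore DB on NVMe when creating an OSD would look roughly like this (a sketch; /dev/sdb and /dev/nvme0n1 are placeholder device names):

Code:
# Data on the SATA SSD, RocksDB metadata (and WAL) on the NVMe device
pveceph osd create /dev/sdb --db_dev /dev/nvme0n1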
 
