[LAB] Ceph architecture, performance and encryption

hardek

Member
Dec 27, 2021
Hi All

I am trying to improve my home lab and move from local LVM to NFS/iSCSI or a distributed file system. I have compared GlusterFS, Ceph, and object-storage solutions such as MinIO and SeaweedFS against NFS/iSCSI, and Ceph seems the most appropriate for me due to its compatibility with Proxmox VE, continuous development, scalability (future expansion with more disks/nodes), and distributed nature, which helps avoid single-node performance bottlenecks.
Now I am in the process of designing the architecture for the future Ceph cluster, and I am not sure whether my idea is a good one. The idea is to completely virtualize the Ceph cluster (each virtual machine would run a different Ceph role: OSD, Manager, Metadata Server, etc.) instead of installing it on the physical Proxmox nodes. Below I show what the setup looks like now and what it will look like.

Now:
  • Proxmox node 1 (PC, role: NAS)
    • Architecture:
      • unencrypted storage: physical disks -> RAID 0 or 6 -> LVM group -> LVM volume -> [ any VM -> virtual disk -> filesystem (EXT4) ]
      • encrypted storage: physical disks -> RAID 0 or 6 -> LVM group -> LVM volume -> [ any VM -> virtual disk -> LUKS2 -> filesystem (EXT4) ]
  • Proxmox node 2 (Laptop, role: compute only)
  • Proxmox node 3 (PC, role: NAS; currently dead, to be replaced in the near future)
    • the same architecture as Proxmox node 1
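For reference, the encrypted path inside a guest can be sketched like this (a sketch only; I am assuming the virtual disk shows up as /dev/vdb in the VM, and "cryptdata" is just a placeholder mapping name):

```shell
# Inside the VM: LUKS2 directly on the virtual disk, EXT4 on top.
cryptsetup luksFormat --type luks2 /dev/vdb    # create the LUKS2 header (destructive!)
cryptsetup open /dev/vdb cryptdata             # unlock -> /dev/mapper/cryptdata
mkfs.ext4 /dev/mapper/cryptdata                # filesystem on the mapped device
mount /dev/mapper/cryptdata /mnt/data          # /mnt/data is a placeholder mount point
```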
What I want to achieve:
  • Proxmox node 1 (PC, role: NAS)
    • Architecture:
      • unencrypted storage: physical disks -> RAID 0 or 6 -> LVM group -> LVM volume -> [ VM with a specific Ceph role -> virtual disk -> LVM group (forced by Ceph) -> LVM volume (forced by Ceph) -> filesystem (EXT4) ]
      • encrypted storage: physical disks -> RAID 0 or 6 -> LVM group -> LVM volume -> [ VM with a specific Ceph role -> virtual disk -> LVM group (forced by Ceph) -> LVM volume (forced by Ceph) -> LUKS2 / built-in dmcrypt -> filesystem (EXT4) ]
  • Proxmox node 2 (Laptop, role: compute only)
  • Proxmox node 3 (PC, role: NAS; currently dead, to be replaced in the near future)
    • the same architecture as Proxmox node 1
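For the encrypted variant, ceph-volume can create the LVM structures and the dmcrypt layer in one step inside the OSD VM. A sketch, where /dev/sdb stands for whatever virtual disk Proxmox passes through to that VM:

```shell
# On the OSD VM: BlueStore OSD with ceph-volume's built-in dmcrypt.
# ceph-volume creates the VG/LV itself (the "forced by Ceph" layers above).
ceph-volume lvm create --bluestore --dmcrypt --data /dev/sdb
```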

My reasons for the above:
  • Flexibility:
    • I would like to be able to switch to another storage solution in the future (if needed) by creating new VMs and migrating the data
    • I would like to test other solutions and compare them with Ceph
  • Isolation:
    • It should improve security
    • Additional resource control for each VM
What I am worried about:
  • Performance:
    • Adding another LVM layer inside the virtual machine might reduce performance
So my questions are:
  • Is my idea worth implementing? Will it have a significant impact on performance, or cause additional problems (that I don't know about yet :) )?
  • Can this design be used in production? (Of course, taking into account that individual nodes would be dedicated to Ceph.)

-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

My second consideration is related to Ceph and encryption. I would like to achieve full encryption similar to what I previously had with local LVM storage + LUKS2 (which allowed me to run encrypted and unencrypted volumes side by side), and which prevents the data from being read after a potential device theft. I have tried to read and understand the Ceph documentation, but I am not sure I understood it correctly. What I was able to gather from the documentation:
  • encryption can be set at the volume level
  • encryption can be set at the RBD image level
  • encryption can be set on the LVM side (dmcrypt; only LUKS1 supported)
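From what I can tell, newer releases (Pacific and later) also allow formatting an RBD image with LUKS2 via librbd, something along these lines (the pool/image names and passphrase file are placeholders):

```shell
# Image-level LUKS2 encryption handled by librbd (Ceph Pacific+).
rbd create --size 10G rbd/vm-disk
rbd encryption format rbd/vm-disk luks2 passphrase.txt
# Clients (e.g. QEMU via librbd) then open the image with the same
# passphrase, so the encryption stays transparent to the guest.
```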
So my questions are:
  • What are the best practices for Ceph encryption?
  • Is there any possibility to use LUKS2 instead of LUKS1?
  • Is it possible to use LUKS2 at the VM level, with the VM acting as a client of the Ceph storage volume?

Thank you in advance for any help and advice.
 
Hi All

I am still looking for the best solution, so let me refresh the topic.

To update my first question: is my idea worth implementing? Will it have a significant impact on performance or cause additional problems (that I don't know about yet :) )?
I have tried to compare performance between a plain partition and LVM. The difference is between 1-2% in read and write. Can I assume that in my Ceph deployment performance will drop by at most ~4% (due to the double LVM)?
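As a sanity check on that estimate (assuming the two LVM layers cost ~2% each and compound independently, which is an assumption rather than a measurement):

```shell
# Two stacked ~2% overheads multiply rather than add:
# retained throughput = 0.98 * 0.98, so the combined loss stays under 4%.
awk 'BEGIN { printf "combined overhead: %.2f%%\n", (1 - 0.98*0.98) * 100 }'
# prints: combined overhead: 3.96%
```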

Regarding the remaining questions, I have not been able to find answers or a similar case/step-by-step tutorial. I would be very grateful if someone could share their experience with Ceph in production and Ceph encryption, to achieve the best possible results in performance and security.

Best Regards
 
Hi @hardek,

I don't know enough to answer many of your questions, but from my dev/homelab experience, three nodes is the bare minimum for Ceph to work, and I'm not sure you can achieve what you would like without storage on all three nodes, e.g. your laptop.

The advantage of a distributed FS is having the storage spread across multiple physical hosts, as you will once the new third node is set up; but with only two hosts storing data, you will still have issues accessing data if one host is down. I am using Ceph so that I can have some HA VMs that migrate quickly, or so that I can migrate VMs manually and quickly reboot a physical host.

I came across your post looking for info on using iSCSI within Ceph. Good luck with your research and setup.
 
