Possible conflicts running ceph storage on proxmox nodes

EuroDomenii

Sep 30, 2016
OFFICIAL WARNINGS:

http://docs.ceph.com/docs/master/start/quick-rbd/

You may use a virtual machine for your ceph-client node, but do not execute the following procedures on the same physical node as your Ceph Storage Cluster nodes (unless you use a VM).

http://docs.ceph.com/docs/master/rados/troubleshooting/troubleshooting-pg/

Tip

DO NOT mount kernel clients directly on the same node as your Ceph Storage Cluster, because kernel conflicts can arise. However, you can mount kernel clients within virtual machines (VMs) on a single node. Additionally, mounting client kernel modules on a single node containing a Ceph daemon may cause a deadlock due to issues with the Linux kernel itself (unless you use VMs for the clients)

PROXMOX DOCS

https://pve.proxmox.com/wiki/Ceph_Server

For smaller deployments, it is also possible to run Ceph services directly on your Proxmox VE nodes. Recent hardware has plenty of CPU power and RAM, so running storage services and VM/CTs on the same node is possible.

This article describes how to set up and run Ceph storage services directly on Proxmox VE nodes.
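
On Proxmox VE this is done with the pveceph tool. As a rough sketch (the exact subcommand names vary between PVE releases, and the network and device names below are placeholders), the setup on each node that should run Ceph looks something like this:

Code:
pveceph install                        # install the Ceph packages on this node
pveceph init --network 10.10.10.0/24   # write the initial cluster config (run once)
pveceph createmon                      # create a monitor on this node
pveceph createosd /dev/sdb             # turn a spare disk into an OSD
pveceph createpool vm-storage          # create a pool for VM/CT images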



PARANOIA CHECK

Intuitively, the client runs inside a VM, not on the host node, so it should not conflict with the Ceph storage stack on the host. That means we fall under the safe exception described above.
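
A quick way to convince yourself of that (a sketch; the VM ID, storage name and disk name are made up): a VM disk that lives on an RBD storage with krbd disabled is opened by QEMU through userspace librbd, so no kernel client touches the host.

Code:
# excerpt from /etc/pve/qemu-server/100.conf (hypothetical VM 100):
#   virtio0: ceph-vm:vm-100-disk-1,size=32G

# the generated QEMU command line shows a file=rbd:... drive, i.e. librbd,
# not a kernel mount or mapping on the host:
qm showcmd 100 | grep -o 'file=rbd:[^ ]*'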

Does the same hold for LXC containers?

https://pve.proxmox.com/wiki/Storage:_RBD

krbd

Access rbd through krbd kernel module. This is required if you want to use the storage for containers.
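
In other words, for containers the images are mapped with the kernel rbd module on the host, which is exactly the situation the Ceph warnings above describe. A sketch of how the two cases look in /etc/pve/storage.cfg (storage names, pool and monitor addresses are placeholders):

Code:
rbd: ceph-vm
        monhost 10.10.10.1 10.10.10.2 10.10.10.3
        pool rbd
        content images
        username admin

rbd: ceph-ct
        monhost 10.10.10.1 10.10.10.2 10.10.10.3
        pool rbd
        content rootdir
        krbd 1
        username admin

The first storage stays on librbd (QEMU in userspace); the second sets krbd 1, so container root disks are mapped via the kernel rbd module on the host.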
 
http://tracker.ceph.com/projects/ceph/wiki/How_Can_I_Give_Ceph_a_Try said:
We recommend using at least two hosts, and a recent Linux kernel. In older kernels, Ceph can deadlock if you try to mount CephFS or RBD client services on the same host that runs your test Ceph cluster. This is not a Ceph-related issue. It’s related to memory pressure and needing to relieve free memory. Recent kernels with up-to-date glibc and syncfs(2) reduce this issue considerably. However, a memory pool large enough to handle incoming requests is the only thing that guarantees against the deadlock occurring. When you run Ceph clients on a Ceph cluster machine, loopback NFS can experience a similar problem related to buffer cache management in the kernel. You can avoid these scenarios entirely by using a separate client host, which is more realistic for deployment scenarios anyway.

That warning refers to the FAQ quoted above (and possibly to single-node test setups; I am not quite sure). Given that our kernel was released after that wiki page was written, it is safe to assume we fall into the "recent kernel" category. Of course, separating the Ceph cluster from the hypervisor cluster is the ideal setup, but in many cases that is simply not possible because of space, energy, hardware and/or financial constraints. I have not heard of any krbd-related deadlocks under memory pressure.
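
For anyone who wants to check how exposed a given node actually is, a quick (and admittedly crude) sketch of what to look at on the host:

Code:
lsmod | grep -E '^(rbd|ceph) '   # are the kernel rbd/cephfs client modules loaded?
rbd showmapped                   # any images currently mapped through krbd?
grep ceph /proc/mounts           # any kernel CephFS mounts on the host?

If all three come back empty, only userspace librbd clients (QEMU) are talking to the cluster from that node.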
 