Basic ZFS vs Ceph Question

Asano

Well-Known Member
Jan 27, 2018
Unfortunately my Google skills were not good enough to get this seemingly basic question for my Ceph vs ZFS decision answered:

When I add new nodes with better disk performance to the pool (which is my current case, as I am replacing a cluster server and the new one will be the first with NVMe drives), will Ceph always be "as slow as the slowest node" when the replication policy requires the data to be present on at least one other node? Or can I configure it similarly to ZFS replication, where I can allow some lag towards the other nodes at the cost of some data loss in case of failure?

Thanks for any insights!
 
This is a complicated question, but for writes, yes: in essence you don't want the client (the VM) to think the data is written until all the replicas have been written to disk. It does, however, depend on the cache policy of the client (send it and forget it, or wait until the IO system has fully confirmed that everything is committed to disk).
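
To make those two client behaviours concrete, here is a minimal Python sketch as seen from inside the guest; the file paths are made up for illustration, and the point is only that durability depends on whether the client waits for the flush, not on where the data eventually lands.

    import os

    # "Send it and forget it": write() returns as soon as the data sits in the
    # guest's page cache; nothing below the VM has confirmed anything yet.
    with open("/mnt/test/lazy.bin", "wb") as f:
        f.write(b"x" * 4096)

    # "Wait for the IO system": fsync() blocks until everything below the guest
    # (virtual disk, and with Ceph RBD underneath, the replica OSDs) has
    # acknowledged the write as durable.
    fd = os.open("/mnt/test/durable.bin", os.O_WRONLY | os.O_CREAT, 0o644)
    try:
        os.write(fd, b"x" * 4096)
        os.fsync(fd)  # returns only once the data is reported as committed
    finally:
        os.close(fd)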

For reads, the opposite is essentially true: data can be served from disks all over the cluster, allowing the aggregate speed to be quite high in some scenarios.
 
Thanks for your answers.

My takeaway now was (and this is what I did in the cluster in question) to go for ZFS in a non-homogeneous cluster where some nodes may have significantly slower storage and where I can live with a data loss of a few minutes in case of a crash/failover. For future Ceph experiments, and for clusters that are not totally homogeneous but close to it where I do want near real-time sync, it may be worth adding some caching mechanism between the VM and Ceph which could allow fast writes to some degree without waiting for all cluster OSDs.
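
For the ZFS half of that takeaway, Proxmox VE ships its own scheduled ZFS storage replication, so the following is only an illustrative Python sketch of the underlying mechanism (the dataset name, target host, and interval are made-up examples, and it assumes the target was seeded once with a full zfs send of the first snapshot): the replication interval is exactly the amount of data you accept losing on failover.

    import subprocess
    import time

    SRC = "rpool/data/vm-100-disk-0"   # hypothetical source dataset
    DST_HOST = "pve-node2"             # hypothetical replication target
    INTERVAL = 300                     # seconds of lag = acceptable data loss

    def replicate(prev_snap, new_snap):
        # Incremental zfs send piped into zfs receive on the target node.
        send = subprocess.Popen(
            ["zfs", "send", "-i", f"{SRC}@{prev_snap}", f"{SRC}@{new_snap}"],
            stdout=subprocess.PIPE)
        subprocess.run(["ssh", DST_HOST, "zfs", "receive", "-F", SRC],
                       stdin=send.stdout, check=True)
        send.wait()

    prev = "repl-0"
    subprocess.run(["zfs", "snapshot", f"{SRC}@{prev}"], check=True)
    n = 1
    while True:
        time.sleep(INTERVAL)           # writes in this window are the exposure
        snap = f"repl-{n}"
        subprocess.run(["zfs", "snapshot", f"{SRC}@{snap}"], check=True)
        replicate(prev, snap)
        subprocess.run(["zfs", "destroy", f"{SRC}@{prev}"], check=True)
        prev, n = snap, n + 1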
 
Asano said:
For future Ceph experiments, and for clusters that are not totally homogeneous but close to it where I do want near real-time sync, it may be worth adding some caching mechanism between the VM and Ceph which could allow fast writes to some degree without waiting for all cluster OSDs.
Even with caching, the IO will land on the OSDs, and only once it has been written does the acknowledgment come back. That is necessary to guarantee that the data has been written out for all copies.
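
That acknowledgment behaviour can be seen from any client; for example, with the librados Python bindings a plain blocking write only returns once every OSD holding a copy of the object has committed it (the pool and object names below are just examples):

    import rados

    # Connect using the node's Ceph config and default keyring.
    cluster = rados.Rados(conffile="/etc/ceph/ceph.conf")
    cluster.connect()
    ioctx = cluster.open_ioctx("rbd")  # example pool name

    # write_full() blocks until all replicas in the object's acting set have
    # acknowledged the write - so it is effectively as fast as the slowest
    # OSD currently holding a copy of this object.
    ioctx.write_full("latency-test-object", b"payload" * 512)

    ioctx.close()
    cluster.shutdown()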
 
