Production PVE Cluster Decisions

ScottDavis

May 23, 2024
I have been testing a three-node cluster with HA, using both ZFS and Ceph (full mesh), and I'm seeing stark differences in disk speed tests between the two. ZFS is more than twice as fast. In fairness, though, the test environment only has 1 GbE NICs, which I know limits Ceph.

As a result, I'm leaning toward ZFS with HA and replication for a 3-4 node production cluster running web and SQL servers.

Can Ceph meet or exceed ZFS performance with a 10 GbE full mesh?
 
ScottDavis said:
"I have been testing a three-node cluster with HA, using both ZFS and Ceph (full mesh), and I'm seeing stark differences in disk speed tests between the two. ZFS is more than twice as fast. ..."
By "HA with ZFS", do you mean ZFS replication? If so, then unlike Ceph, it is asynchronous: the guest writes only to the local disk, and changes are shipped to the other node afterwards, on a schedule. With Ceph, every write crosses the network to each replica and is acknowledged only after all replicas have it, so you are gated by the network, disks, and CPU of every node involved.
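The difference between the two write paths can be sketched with a toy latency model. This is only an illustration: the function names and the millisecond figures are made up for the example, and the model ignores CPU cost, queueing, bandwidth limits, and everything else a real cluster adds.

```python
# Toy model of write-acknowledgement latency. All names and numbers are
# illustrative assumptions, not measurements from any real cluster.

def local_write_ms(disk_ms: float) -> float:
    """Asynchronous replication: the write is acknowledged as soon as the
    local disk has committed it; replication happens later."""
    return disk_ms


def sync_replicated_write_ms(
    primary_disk_ms: float,
    replica_paths: list[tuple[float, float]],
    client_rtt_ms: float = 0.0,
) -> float:
    """Synchronous replication: the write is acknowledged only after the
    primary and every replica have committed it.

    replica_paths holds one (network_rtt_ms, disk_ms) pair per replica.
    Replicas are written in parallel, so the slowest path sets the pace.
    """
    slowest_replica = max(
        (rtt + disk for rtt, disk in replica_paths), default=0.0
    )
    return client_rtt_ms + max(primary_disk_ms, slowest_replica)


if __name__ == "__main__":
    disk_ms = 0.5   # assumed commit latency of one disk
    rtt_ms = 0.25   # assumed round trip between two nodes

    print("local write:     ", local_write_ms(disk_ms), "ms")
    print(
        "replicated write:",
        sync_replicated_write_ms(disk_ms, [(rtt_ms, disk_ms)] * 2),
        "ms",
    )
```

Even in this simplified form, the replicated write can never be faster than the local one, and every bit of network latency lands directly in the acknowledgement time. A faster network shrinks that gap but does not remove it.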

A lot of your results will depend on the type of disks used (HDD, SSD, NVMe). The network matters, and so does the CPU. How you write the test data is critical as well (dd if=/dev/zero is not a good way to test).
Performance testing and comparison are very nuanced.
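One likely reason zeros make poor test data, assuming compression is enabled on the storage under test, is that a stream of zeros compresses to almost nothing, so very little data actually reaches the disk. The sketch below shows the effect using Python's zlib as a stand-in (ZFS itself uses other compressors such as lz4); the helper name `compressed_fraction` is made up for this example.

```python
import os
import zlib


def compressed_fraction(data: bytes) -> float:
    """Return compressed size divided by original size (smaller = more compressible)."""
    return len(zlib.compress(data)) / len(data)


SIZE = 1 << 20  # 1 MiB of test data

zeros = bytes(SIZE)             # what dd if=/dev/zero would produce
random_data = os.urandom(SIZE)  # incompressible data

print(f"zeros:  {compressed_fraction(zeros):.4f} of original size")
print(f"random: {compressed_fraction(random_data):.4f} of original size")
```

The zero-filled buffer shrinks to a tiny fraction of its size while the random buffer does not shrink at all, so a benchmark fed with zeros may mostly measure the compressor. A tool such as fio, which can use direct I/O and non-zero buffers, is a common alternative.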

Good luck


Blockbridge : Ultra low latency all-NVME shared storage for Proxmox - https://www.blockbridge.com/proxmox
 
