Guidance on Shared Storage

troycarpenter

Renowned Member
Feb 28, 2012
Central Texas
I have been fighting I/O performance issues on our Ceph server for some time. Sometimes the VMs' I/O performance is so bad that I have to move the VM image to a local drive to get performance back, so I'm now exploring other shared storage options. We're running Proxmox 7.1-11. With the Ceph server as VM storage, the average load is around 6; it drops to 0.5 when I switch to local storage.
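For anyone wanting raw numbers rather than load averages, rados bench against the affected pool gives a quick baseline (the pool name below is a placeholder, not our actual pool):

# write 4 MB objects for 60 s; keep them so the read test has data
rados bench -p vm-pool 60 write -t 16 --no-cleanup
# random reads against the objects written above
rados bench -p vm-pool 60 rand -t 16
# remove the benchmark objects when done
rados -p vm-pool cleanup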

Hardware
3 Ceph Nodes: 24 x Intel(R) Xeon(R) Gold 6128, 10 x Toshiba MG04SCA20EE 4TB HDD (Ceph Data), 2 x Samsung MZ7KM1T9HMJP-00005 (Ceph DB).
5 Compute Nodes: 80 x Intel(R) Xeon(R) Gold 6230, Single system SSD. There are 7 other empty drive bays.

Networking is all 10Gb between nodes. The cluster runs about 80 VMs, mostly I/O-intensive workloads like database operations.

Our first plan was to replace the 30 Toshiba HDDs with SSDs, but the quote from our vendor was more than we could fit into the budget. So now I'm looking for another shared storage solution using the compute nodes' empty drive bays.

Any suggestions on what solutions I should look into? I'm thinking of GlusterFS, but want some thoughts before going down that path.
 
I wouldn't mix Ceph and Gluster and, to be honest, I also wouldn't ditch Ceph.
Of course the HDDs are the problem, but why do you think another cluster file system would make this any better?
Maybe you can start the replacement with 4 SSDs per node and put a separate pool on them for the I/O-heavy stuff?
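Roughly, the separate pool can be done with CRUSH device classes, something like this (rule and pool names are made up for the example, and pg_num depends on your OSD count):

# Ceph assigns device classes automatically; check the SSDs show up as class "ssd"
ceph osd df tree
# replicated rule that only places data on ssd-class OSDs, host as failure domain
ceph osd crush rule create-replicated ssd-only default host ssd
# create the pool and pin it to the ssd-only rule
ceph osd pool create vm-fast 128
ceph osd pool set vm-fast crush_rule ssd-only
ceph osd pool application enable vm-fast rbd

On Proxmox you'd then add vm-fast as an RBD storage entry (or create the pool with pveceph pool create, which also takes a crush rule option, if I remember right) and move only the I/O-heavy disks onto it.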
 
I was just considering that last suggestion. I'll see if I can do that and whether it improves performance.
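If I get that pool set up, moving the heavy disks over should just be a qm move_disk per VM, something like this (VM ID, disk slot, and storage name are placeholders):

# live-move disk scsi0 of VM 101 to the SSD-backed storage, deleting the old image
qm move_disk 101 scsi0 vm-fast --delete 1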