Sharing Storage (Ceph or Gluster) for VM's in a 3 node scenario...

andrea68

Member
Jun 30, 2010
101
0
16
Hi,

I have 3 Dell PowerEdge R610 servers, each with 24GB RAM and 6 SAS 300GB 10K rpm disks.

I wish to build a cluster (mostly KVM) with a shared storage system between these 3 nodes, using the internal storage to do it. I was thinking of using Ceph or GlusterFS, but I'm not sure which is the better choice.

Each server has 4 ethernet ports (1Gb) and I was thinking to use 2 ports with a separate switch for storage, and the other one (or two) for networking.
For the internal storage I will use one disk (or two in RAID 1?), maybe smaller than 300GB SAS, for the Proxmox OS, and the other ones to build the shared storage.

Do you think I'm going in the right direction?
Has anyone already tried a similar solution?
In this scenario, which is the better choice, Ceph or Gluster?

Thanks in advance!
 
Feb 14, 2019
77
2
8
Erlangen, Germany
Hi!

In your case I would choose to build a cluster with Ceph. There you have the snapshot option available, and in my experience Ceph is faster than other scalable network storage solutions, depending on hardware and so on, of course.

For Proxmox you should use RAID 1 with two small SSDs, and the six 300GB 10K SAS drives for Ceph - but not in RAID - connect each Ceph disk directly to an onboard SATA port.
Use two NICs for the cluster network, maybe with a bonding config, and the other two NICs for Ceph. In a small environment 1Gb NICs for Ceph are OK, but if you go to production or something like that, 10Gb NICs are much better.
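A bonded pair for the storage network might look like this in /etc/network/interfaces (just a sketch; the interface names eno1/eno2, the bond mode, and the 10.10.10.0/24 storage subnet are assumptions):

```
# /etc/network/interfaces fragment (Debian/Proxmox ifupdown)
auto bond0
iface bond0 inet static
    address 10.10.10.11
    netmask 255.255.255.0
    bond-slaves eno1 eno2
    bond-miimon 100
    bond-mode active-backup   # or 802.3ad if both switch ports support LACP
# VM traffic stays on the other NICs / the vmbr0 bridge
```

With 802.3ad the switch must be configured for LACP on those two ports; active-backup needs no switch support.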

best regards,
Roman
 

andrea68

Member
Jun 30, 2010
101
0
16
For only 3 hosts, I would recommend Gluster
Tnx Ness.

I will look into it.

It's a little cluster we're talking about, with modest performance... maybe I will upgrade every host to a 10Gb SFP+ network card (plus a 10Gb switch) to boost performance (if we use shared storage)...
There will be a few VMs for internal test and use (Windows and Linux).

Right now the cluster has no shared storage at all, and all the VMs are saved nightly to a NAS connected to the cluster via NFS (very slow, but good enough for backup purposes).
It would be nice to have, not a full HA environment, but simple shared storage to improve migration from node to node.

This will be my goal...

Tnx for your advice!
 

andrea68

Member
Jun 30, 2010
101
0
16
And why? Maybe a reason why it is "better" than Ceph with only three nodes?
I think Ceph is easier to set up; that's my opinion.
I have never used Ceph, so I don't know whether it's easier or not.
I have used Gluster only a couple of times, and I found it very easy to set up but not simple to recover in case of a node failure...

Ceph seems a little complicated to me because it uses more components (OSDs, networking, journaling, etc.), but maybe it's more rock solid?

Just asking...
 

andrea68

Member
Jun 30, 2010
101
0
16
OK, if you have more experience with Gluster, you should choose it.
Ceph is really easy once you know it.
You should take a look at it anyway; Install Ceph on Proxmox
With Ceph you have "both": block-level storage for the VMs in raw format, and file-level storage for ISOs, backups, ...
I definitely will.

Do you think my hardware is good enough as a starting point, or should I think about upgrading some components right now, before I start?
 
Feb 14, 2019
77
2
8
Erlangen, Germany
I definitely will.

Do you think my hardware is good enough as a starting point, or should I think about upgrading some components right now, before I start?
For a start I would say the Dell R610s are of course OK! Your maximum RAM is also good. Storage: with GlusterFS - I don't really know the distribution here - you have more storage than with Ceph, because with Ceph only about a third of the total SAS capacity is usable for VMs. In your case: 4 × 300GB data disks (max 6 SAS per Dell, minus 2 for the Proxmox RAID 1) × 3 servers = 3.6TB raw; with a Ceph pool size of 3/2 (three replicas) that leaves 3.6TB / 3 ≈ 1.2TB, so roughly 1TB for raw images and VMs.
For GlusterFS, as I said, I don't know the maximum usable storage.
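The Ceph capacity arithmetic can be checked in a few lines of shell (a sketch assuming 4 data disks of 300GB per node, 3 nodes, and a replicated pool with size=3):

```shell
# Raw capacity: 4 data disks x 300 GB x 3 nodes.
raw_gb=$((4 * 300 * 3))
# A replicated pool with size=3 stores every object three times.
usable_gb=$((raw_gb / 3))
echo "raw: ${raw_gb} GB  usable: ${usable_gb} GB"
# prints: raw: 3600 GB  usable: 1200 GB
```

In practice you keep some headroom free so the cluster can re-replicate after a disk or node failure, which is why "roughly 1TB" is the realistic number.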

Best regards, roman
 

andrea68

Member
Jun 30, 2010
101
0
16
For a start I would say the Dell R610s are of course OK! Your maximum RAM is also good. Storage: with GlusterFS - I don't really know the distribution here - you have more storage than with Ceph, because with Ceph only about a third of the total SAS capacity is usable for VMs. In your case: 4 × 300GB data disks (max 6 SAS per Dell, minus 2 for the Proxmox RAID 1) × 3 servers = 3.6TB raw; with a Ceph pool size of 3/2 (three replicas) that leaves 3.6TB / 3 ≈ 1.2TB, so roughly 1TB for raw images and VMs.
For GlusterFS, as I said, I don't know the maximum usable storage.

Best regards, roman
Tnx Roman...

How about network cards?
Is it mandatory to upgrade from 1Gb to 10Gb?
 

alexskysilk

Active Member
Oct 16, 2015
575
61
28
Chatsworth, CA
www.skysilk.com
And why? Maybe a reason why it is "better" than Ceph with only three nodes?
Ceph is happiest with a greater number of nodes/OSDs, and with 1Gbit connections your experience may be less than ideal. Gluster is better suited for this size of deployment. If you have the time and inclination, try both and see how either performs on your HW.

How about network cards?
Is it mandatory to upgrade from 1Gb to 10Gb?
No, it's not mandatory, but it will help your perceived performance quite a bit, especially as your VM count increases. With only 3 nodes you can deploy mesh networking, which means you won't need a 10G switch.
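For reference, the routed full-mesh variant described in the Proxmox wiki boils down to a direct cable between each pair of nodes plus host routes; a sketch for one node follows (the interface names, addresses, and peer IPs are assumptions):

```
# /etc/network/interfaces fragment on node1 (routed full mesh, no switch)
auto ens1f0
iface ens1f0 inet static
    address 10.15.15.50
    netmask 255.255.255.0
    # direct cable to node2 (10.15.15.51)
    up ip route add 10.15.15.51/32 dev ens1f0
auto ens1f1
iface ens1f1 inet static
    address 10.15.15.50
    netmask 255.255.255.0
    # direct cable to node3 (10.15.15.52)
    up ip route add 10.15.15.52/32 dev ens1f1
```

The other two nodes get the mirror-image configuration, each routing to its two peers over the direct links.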
 

andrea68

Member
Jun 30, 2010
101
0
16
Ceph is happiest with a greater number of nodes/OSDs, and with 1Gbit connections your experience may be less than ideal. Gluster is better suited for this size of deployment. If you have the time and inclination, try both and see how either performs on your HW.
Tnx alexskysilk.
The starting point is 3 nodes minimum, but I don't exclude that we will increase from 3 to 4 or 5 nodes... in time...
It depends.

I will try both systems, but at some point I need to decide which is the right direction...


No, it's not mandatory, but it will help your perceived performance quite a bit, especially as your VM count increases. With only 3 nodes you can deploy mesh networking, which means you won't need a 10G switch.
The cost of a 10G card is very affordable; what concerns me is the cost of the switch, configuration, maintenance, etc...
I will look into it...
 

ness1602

New Member
Oct 28, 2014
27
2
3
Gluster is very easy to set up, and performs okay with 3 nodes (better than Ceph, that is).
 
Feb 14, 2019
77
2
8
Erlangen, Germany
Tnx alexskysilk.
The starting point is 3 nodes minimum, but I don't exclude that we will increase from 3 to 4 or 5 nodes... in time...
It depends.
I will try both systems, but at some point I need to decide which is the right direction...
The cost of a 10G card is very affordable; what concerns me is the cost of the switch, configuration, maintenance, etc...
I will look into it...
Please try Ceph if you plan to grow the cluster with more nodes... and 10Gb NICs are not that expensive. But trying it with 1Gb is really OK.
 

Dark26

Member
Nov 27, 2017
81
5
8
42
Have you tried to recover from a node failure without problems?
I tried to, but with some issues...
I think it depends on the configuration: replica, distribute, number of bricks...

For example, with a replica 3 arbiter 1, or a full replica 3 with 3 bricks, recovery is easy because all the data is on the brick itself.

If you have a problem, the data is visible on the storage.

If you have distribute ...
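For illustration, a replica 3 arbiter 1 volume like the one mentioned above would be created roughly like this (a sketch; the hostnames and brick paths are made up, and the peers must already be probed):

```shell
# On node1, after `gluster peer probe node2` and `gluster peer probe node3`:
# two full data bricks plus one arbiter brick
gluster volume create vmstore replica 3 arbiter 1 \
    node1:/data/brick1 node2:/data/brick1 node3:/data/arbiter1
gluster volume start vmstore
gluster volume info vmstore   # check the replica/arbiter layout
```

The arbiter brick stores only file metadata, which keeps the quorum cheap; a distribute layout spreads files across bricks with no copies, which is why it is much harder to recover.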
 

andrea68

Member
Jun 30, 2010
101
0
16
Hi,

A quick update on my little project.
I have almost completed the 3-node configuration.

Now I have the 3 Dell R610s with 96GB RAM each, a dual-port 10Gb SFP+ NIC in each, and storage configured this way:

- 5 SAS 10K 300GB disks on each node
- 1 SAS 10K 146GB disk for the Proxmox OS on each node

I will use the 10Gb SFP+ network only for the network filesystem.
In this scenario, is Gluster or Ceph more suitable?
Tnx...
 

jdancer

New Member
May 26, 2019
12
0
1
49
I don't have Dells, but I do have 8-drive SunFires just as old as the R610s. They don't have 10GbE, but they do have a full mesh 1GbE setup: https://pve.proxmox.com/wiki/Full_Mesh_Network_for_Ceph_Server

Since you can configure Ceph via the GUI, I suggest you go with that: https://pve.proxmox.com/wiki/Manage_Ceph_Services_on_Proxmox_VE_Nodes. Ceph requires the HBA to be in IT mode, not IR mode. There are plenty of guides online for flashing to IT mode.
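The CLI equivalent of those GUI steps is a handful of pveceph commands (a sketch; exact subcommand names vary a bit between Proxmox VE versions, and the network and device names here are assumptions):

```shell
pveceph install                           # on every node: install the Ceph packages
pveceph init --network 10.15.15.0/24      # once: define the Ceph network
pveceph createmon                         # on every node: add a monitor
pveceph createosd /dev/sdb                # per data disk: create an OSD
```

After the OSDs are up, a replicated pool created in the GUI (or with `pveceph pool create`) can be added as RBD storage for the VMs.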

To speed up IOPS even more, I enabled 'writeback' for the hard disk cache https://pve.proxmox.com/wiki/Qemu/KVM_Virtual_Machines#qm_virtual_machines_settings, and write caching on the SAS drives via 'sdparm --set WCE /dev/sdX' http://liobaashlyritchie.blogspot.com/2014/06/how-to-disable-or-enable-write-caching.html. For the CPU type, I chose 'host'. I ran the 'fio' database benchmark and got over 1,200 write IOPS; the nodes have 15K RPM SAS drives. https://support.binarylane.com.au/support/solutions/articles/1000055889-how-to-benchmark-disk-i-o. I'm still looking to increase IOPS on the cluster without upgrading the hardware.
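The write-cache and benchmark steps look roughly like this (a sketch; /dev/sdb, the test path, and the fio parameters are assumptions, not the exact run above):

```shell
# Enable the drive's volatile write cache (the WCE bit), then verify it.
sdparm --set WCE /dev/sdb
sdparm --get WCE /dev/sdb

# 4k random-write benchmark, direct I/O, queue depth 32.
fio --name=randwrite --filename=/mnt/test/fio.dat --size=1G \
    --rw=randwrite --bs=4k --iodepth=32 --ioengine=libaio \
    --direct=1 --runtime=60 --time_based
```

Keep in mind that a volatile write cache can lose in-flight data on power failure, so weigh the IOPS gain against that risk.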

I also recommend you do a ZFS RAID 1 for the boot drive, so get another 146GB drive to mirror onto: https://pve.proxmox.com/pve-docs/chapter-pve-installation.html#_using_the_proxmox_ve_installer

And while you are creating VMs, don't forget to install the QEMU guest agent in the Linux VMs. I created a VM on the cluster to collect node info and display it in Grafana using InfluxDB: https://grafana.com/dashboards/10048 https://pve.proxmox.com/wiki/External_Metric_Server
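Inside a Debian or Ubuntu guest the agent install is two commands (the package name differs on other distros, and the VM's 'QEMU Guest Agent' option must also be enabled in Proxmox):

```shell
apt install qemu-guest-agent
systemctl enable --now qemu-guest-agent
```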

Let me know if you have any more questions. I've been working with this full mesh 3-node Proxmox cluster for about a month or so. It's pretty cool.
 
Feb 14, 2019
77
2
8
Erlangen, Germany
Hi,

A quick update on my little project.
I have almost completed the 3-node configuration.

Now I have the 3 Dell R610s with 96GB RAM each, a dual-port 10Gb SFP+ NIC in each, and storage configured this way:

- 5 SAS 10K 300GB disks on each node
- 1 SAS 10K 146GB disk for the Proxmox OS on each node

I will use the 10Gb SFP+ network only for the network filesystem.
In this scenario, is Gluster or Ceph more suitable?
Tnx...
Nice to hear. Use one NIC for the cluster network of the nodes, and the other 10Gb NIC for the Ceph network. This is what I would do in your scenario.
In my tests in our environment, I get up to 100K IOPS over the Ceph network storage.
But don't use any RAID for the five 300GB SAS disks. Maybe an SSD per node for the journal could boost the performance of the Ceph storage, if you choose Ceph instead of Gluster.

best regards
 
