Ceph SSD disk performance Kingston vs Samsung

Hello,

I have 3 servers and plan to add 3 more nodes later.
Lenovo SR570
CPU: 2x Xeon Silver 4208, 2.1 GHz, 8C/16T
OS: 2x 480 GB SATA M.2 SSD (mirroring kit)
Network: 4x 10GBase-T; 2x LACP for frontend and cluster, and 2x LACP for Ceph cluster and public; everything will be on individual VLANs.
Switch: 2x Aruba Instant On 1960 12XGT 4SFP+ Switch (JL805A)

I want to use 2x 3.84 TB SSDs in each node, 6 disks in the cluster. The question is which to choose, and what real IOPS and bandwidth I can get from them? I can't find real-world results or a best configuration.

Disks:
Samsung PM893 3.84 TB, read-intensive, only has 30,000 write IOPS in the spec
Kingston DC600M 3.84 TB, mixed-use, has good write IOPS in the specs
 
Network: 4x 10GBase-T; 2x LACP for frontend and cluster, and 2x LACP for Ceph cluster and public; everything will be on individual VLANs.
I wouldn't separate it; the concept of a separate Ceph frontend and backend network is relevant if your cluster is larger and you really have a lot of load on the links. That will probably never be the case with your setup, and the only thing you'll gain is increased complexity. In my opinion, 2x 10 G would be enough, but you can also do 4x 10 G. You could also use 2x 10 G for Ceph and 2x 10 G for cluster and VM traffic.
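If you go with one 4x 10 G bond and VLANs, a rough sketch of /etc/network/interfaces on Proxmox (ifupdown2) could look like the following. The interface names, VLAN IDs and addresses are only placeholders, and it assumes the two switches are stacked so a single LACP bond can span both:

auto bond0
iface bond0 inet manual
    bond-slaves eno1 eno2 eno3 eno4
    bond-mode 802.3ad
    bond-miimon 100
    bond-xmit-hash-policy layer3+4

auto vmbr0
iface vmbr0 inet manual
    bridge-ports bond0
    bridge-stp off
    bridge-fd 0
    bridge-vlan-aware yes
    bridge-vids 2-4094

# placeholder VLAN 100: Ceph public + cluster network
auto vmbr0.100
iface vmbr0.100 inet static
    address 10.10.100.11/24

# placeholder VLAN 200: Proxmox (corosync) cluster network
auto vmbr0.200
iface vmbr0.200 inet static
    address 10.10.200.11/24

# placeholder VLAN 10: management / frontend
auto vmbr0.10
iface vmbr0.10 inet static
    address 192.168.10.11/24
    gateway 192.168.10.1

VM traffic then simply gets its VLAN tag set on the guest NIC attached to vmbr0.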
Switch: 2x Aruba Instant On 1960 12XGT 4SFP+ Switch (JL805A)
In terms of specs they seem to fit somewhat, but to me it all looks less like enterprise gear and more like the cloud-managed small-business world. Personally, I would rather buy switches from Arista or Juniper with MLAG; I have a better impression of them.

A lot of storage performance also depends on the switches, and they really have to deliver on their ports. So you should be really sure that these aren't junk.
Samsung PM893 3.84 TB, read-intensive, only has 30,000 write IOPS in the spec
Kingston DC600M 3.84 TB, mixed-use, has good write IOPS in the specs
Personally, I would advise against Kingston; I've only had bad experiences with them. I can highly recommend the Samsung.
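If you want to see for yourself what a drive is worth for Ceph before building OSDs on it, a short fio sync-write run already tells you a lot. This is only a sketch, /dev/sdX is a placeholder, and the test writes to the raw device, so run it only on an empty disk:

# WARNING: writes directly to the raw device and destroys its contents
# single-threaded 4k synchronous writes, similar to the small sync writes Ceph issues
fio --name=ceph-ssd-test --filename=/dev/sdX \
    --ioengine=libaio --direct=1 --sync=1 \
    --rw=write --bs=4k --numjobs=1 --iodepth=1 \
    --runtime=60 --time_based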

I want to use 2x 3.84 TB SSDs in each node, 6 disks in the cluster. The question is which to choose, and what real IOPS and bandwidth I can get from them?
That's not an easy question to answer. It also depends on how you set up Ceph, for example whether you use replica 3 or just replica 2, or whether you even want to work with erasure coding (EC). Personally, I can only advise you to use replica 3, because EC definitely doesn't work well with 3 nodes, and you shouldn't use replica 2 at all if the data is important to you.
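For reference, a replicated pool with 3 copies that keeps serving I/O as long as 2 of them are available can be set up roughly like this (the pool name is just a placeholder; if your pveceph version doesn't take the size options, the plain ceph commands do the same):

# create the pool via the Proxmox tooling
pveceph pool create vm-pool --size 3 --min_size 2

# or adjust an existing pool with plain Ceph commands
ceph osd pool set vm-pool size 3
ceph osd pool set vm-pool min_size 2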

With replica 3, however, the writes have to be done on 3 nodes, which costs time, which means higher latency and therefore less performance. So you shouldn't expect 10k IOPS; I'm guessing something between 2k and 5k IOPS.
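Once the cluster is running, you can measure this yourself instead of guessing; rados bench against a test pool gives comparable throughput and IOPS numbers (the pool name is a placeholder):

# 60 s of 4 MB object writes with 16 threads, keep the objects for the read tests
rados bench -p testpool 60 write -b 4M -t 16 --no-cleanup

# sequential and random reads of the objects written above
rados bench -p testpool 60 seq -t 16
rados bench -p testpool 60 rand -t 16

# remove the benchmark objects afterwards
rados -p testpool cleanup

# small-block variant for an IOPS-oriented number
rados bench -p testpool 60 write -b 4K -t 16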

With 2 OSDs per node, you should keep in mind that you can only fill each OSD to a maximum of about 42.5%. If one OSD fails, the other OSD in that node must be able to absorb its data, and 2 x 42.5% = 85%. At 85% you hit the nearfull ratio, and at 95% the full ratio, which can bring your cluster to a standstill. Therefore, you should consider 85% the maximum fill level.
It would therefore be better to plan for 3-4 OSDs per node. Several smaller OSDs are significantly better for overall performance and availability than just a few large ones.
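You can keep an eye on the fill levels and the configured ratios from the CLI, for example:

# per-OSD utilization, grouped by host
ceph osd df tree

# overall pool and cluster usage
ceph df

# show the configured ratios (defaults: nearfull 0.85, full 0.95)
ceph osd dump | grep ratio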
 
I wouldn't separate it; the concept of a separate Ceph frontend and backend network is relevant if your cluster is larger and you really have a lot of load on the links. That will probably never be the case with your setup, and the only thing you'll gain is increased complexity. In my opinion, 2x 10 G would be enough, but you can also do 4x 10 G. You could also use 2x 10 G for Ceph and 2x 10 G for cluster and VM traffic.
So your proposal is to make one 4x 10 GbE LACP bond? Will it work if I make 1 VLAN for Ceph cluster and public, 1 VLAN for the cluster network, and many VLANs for VMs?
In terms of specs they seem to fit somewhat, but to me it all looks less like enterprise gear and more like the cloud-managed small-business world. Personally, I would rather buy switches from Arista or Juniper with MLAG; I have a better impression of them.

A lot of storage performance also depends on the switches, and they really have to deliver on their ports. So you should be really sure that these aren't junk.
These switches are from HP under the Aruba brand, aimed at small business; they don't have many enterprise features, but they do have basic functions like LACP and can be stacked with up to 4 devices.
And the specs look good:
Switching capacity: 320 Gbit/s
Throughput: 238 Mpps
MAC address table: 16,000 entries
Latency (10-100 Mbps): 7.4 µs
Latency (1 Gbps): 4.2 µs
Latency (10 Gbps): 1.1 µs
Jumbo frames: supported

Personally, I would advise against Kingston; I've only had bad experiences with them. I can highly recommend the Samsung.
What kind of bad experiences did you have? Is it a good idea to choose a read-intensive SSD rather than a mixed-use one? I just want to know what to expect from Kingston.


That's not an easy question to answer. It also depends on how you set up Ceph, for example whether you use replica 3 or just replica 2, or whether you even want to work with erasure coding (EC). Personally, I can only advise you to use replica 3, because EC definitely doesn't work well with 3 nodes, and you shouldn't use replica 2 at all if the data is important to you.

With replica 3, however, the writes have to be done on 3 nodes, which costs time, which means higher latency and therefore less performance. So you shouldn't expect 10k IOPS; I'm guessing something between 2k and 5k IOPS.

With 2 OSDs per node, you should keep in mind that you can only fill each OSD to a maximum of about 42.5%. If one OSD fails, the other OSD in that node must be able to absorb its data, and 2 x 42.5% = 85%. At 85% you hit the nearfull ratio, and at 95% the full ratio, which can bring your cluster to a standstill. Therefore, you should consider 85% the maximum fill level.
It would therefore be better to plan for 3-4 OSDs per node. Several smaller OSDs are significantly better for overall performance and availability than just a few large ones.

I prefer to do 2/2 copies. This is a development cluster, so it won't have critical VMs except a domain controller and GitLab, and everything will be backed up.

It would be great if I could get the numbers stated in the 2018 Ceph performance benchmark (attached). Do you think it's not realistic to get those numbers?

Ceph is new to me, and I can't find any exact information with setups and numbers to know what to expect in reality.
 

Attachments

  • Proxmox-VE_Ceph-Benchmark-201802.pdf
I've rarely found Kingstons that are good; on the other hand, the PM893 is something of an industry standard for entry-level enterprise disks. Yes, they are read-oriented, but they work really well with everything Proxmox.
 
