Genoa Cluster (CPU Bottleneck with Storage)

Ramalama

I bought 2 servers that I want to use in a cluster. Both servers are exactly identical; these are the specs of one server:
1x 9274F
12x 64GB DDR5 4800
8x 3.2TB Micron 7450 Max (VM storage)
2x PM9A3 960GB M.2 110mm (Proxmox OS + ISOs)
2x 25G Interfaces (Intel E810)

So in the end it's a low-core-count, high-clock Proxmox HA cluster with as much memory bandwidth as possible (460GB/s) and about as much storage speed as the budget allows.
Sure, you could get more storage speed, but we don't have unlimited money...

The servers will simply be configured as an HA cluster, with one quorum device (a VM) on another Proxmox HA cluster.
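For reference, the QDevice part would be wired up roughly like this (the IP is a placeholder; the steps follow the standard Proxmox approach, untested here):

# on the quorum VM (Debian-based): install the qnetd daemon
apt install corosync-qnetd
# on both cluster nodes: install the qdevice client
apt install corosync-qdevice
# from one cluster node: register the quorum VM (10.0.0.53 is a placeholder IP)
pvecm qdevice setup 10.0.0.53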
----------

I'm getting the servers on Monday, but both 9274F CPUs still don't seem to have shipped, so maybe I can still exchange them for the 9474F on Monday, which would kill my budget :-(

The issue that I fear:
My worry is the CPU. Lately I've watched a lot of videos and read a lot saying that the CPU will bottleneck the storage, and I need the storage speed, because we have a lot of very storage-intensive VMs and crappy file-based databases (not PSQL/MySQL/MSSQL, proprietary stuff similar to SQLite but spread over tons of files), as if SQLite used multiple files instead of one...

I initially planned to go with ZFS RAID10 anyway, which is why I chose 8 disks: RAID10 doesn't need any parity calculation and should be the fastest option, I think.
8 disks are also pretty much the limit of the budget (don't forget, I'm building 2 of these servers); I could still go to 12 later, because 4 slots are free and usable.
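For illustration, a striped-mirror (RAID10-style) pool over the 8 NVMe drives would look roughly like this; the by-id names and the pool name "tank" are placeholders, and ashift/compression are just common starting points, not tested recommendations:

zpool create -o ashift=12 -O compression=lz4 -O atime=off tank \
  mirror /dev/disk/by-id/nvme-7450_0 /dev/disk/by-id/nvme-7450_1 \
  mirror /dev/disk/by-id/nvme-7450_2 /dev/disk/by-id/nvme-7450_3 \
  mirror /dev/disk/by-id/nvme-7450_4 /dev/disk/by-id/nvme-7450_5 \
  mirror /dev/disk/by-id/nvme-7450_6 /dev/disk/by-id/nvme-7450_7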

However, I'll go with ZFS because, well, ZFS is easy and I use ZFS everywhere; I've just never used it with drives this fast.
I would definitely get more performance out of the storage with mdadm, no question, but mdadm + LVM isn't nearly as comfortable as ZFS.
I have no clue about Ceph, and I think a 2x 25G (LACP) network wouldn't be enough for Ceph; even with LACP, a single flow is still limited to 25G, so an XOR bond might be a lot better for Ceph to get close to 50G.
But 50G is still crap, that's about 6 gigabytes/s... (the RAID10 array can theoretically do ~25 gigabytes/s).

1st question: What would you do as the storage system?
2nd question: What about the CPU, will it bottleneck? Should I get the 9474F and sell my car?
3rd question: If ZFS, should I disable the ARC completely, and would that even be beneficial? (In 12-channel mode the memory is still extremely fast, at 460GB/s.)

Cheers :)
 
1) With two servers you can't do Ceph, so I would go for a dual-node ZFS cluster (you still need a third vote via an external quorum device).
2) You should be fine with those CPUs, but of course it depends on how much of them you want to use for VM/CT resources.
3) The ZFS ARC only affects ZFS reads, so those will be quicker if you have an ARC. I would NOT disable it completely, but limit it according to your needs (see the sketch below); by default it will take up to 50% of RAM if you don't change anything.
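To illustrate the ARC limit: on Proxmox this is typically done via a ZFS module parameter; the 16 GiB value below is only an example, not a recommendation for this box:

# /etc/modprobe.d/zfs.conf -- cap the ARC at 16 GiB (example value, in bytes)
options zfs zfs_arc_max=17179869184

# apply immediately without a reboot
echo 17179869184 > /sys/module/zfs/parameters/zfs_arc_max

# make sure the limit is also applied at boot
update-initramfs -u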
 
About the 3rd point.

The issue with the ARC idea is that with a limited ARC I will get a lot of cache misses.
A cache miss introduces additional latency, as it always does.

On normal servers, where disk/SSD speed is nowhere near memory speed, the ARC is definitely a win.

But if you already have something like ~50GB/s of RAID10 read speed, using the ARC will introduce additional delay/latency, at least in my thinking.

Or, since the memory is still much faster at 460GB/s across 12 channels, I'm getting into territory where using the ARC vs. disabling it makes no measurable difference.

That's where I'm not 100% sure whether the ARC is still beneficial at those SSD speeds, or whether it just introduces latency and, even with the still-hella-fast memory, ends up being useless.

In the end I have to benchmark this to find out what makes the most sense configuration-wise, because probably no one can answer that question.
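When I get to the benchmarks, something along these lines is what I have in mind; the dataset path is a placeholder and the fio parameters are just a starting point, not a tuned workload:

# 4k random-read test against a file on the ZFS dataset (path/size are examples)
fio --name=randread --filename=/tank/bench/testfile --size=50G \
    --rw=randread --bs=4k --iodepth=32 --numjobs=8 \
    --ioengine=libaio --runtime=120 --time_based --group_reporting

# watch ARC hits/misses while the test runs (1s interval, 10 samples)
arcstat 1 10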

Cheers :)
 
Okay, some answers to my own questions:

1. Disabling the ARC is not a good idea at all; it shouldn't be done and isn't even really possible.
-> ZFS needs the ARC, at minimum when compression is enabled on the pool, since for compression the blocks get copied to RAM anyway if they aren't already there, which would hurt performance twice without an ARC.
--> BUT I wasn't wrong with my assumption: the ARC won't be beneficial and will indeed hurt performance in my situation. With ZFS 3.0 there will probably be a direct option per pool to bypass the ARC when the storage is extremely fast.
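As a stopgap, caching can already be dialed back per dataset today; whether that actually helps on this hardware is exactly what I'd have to benchmark, so treat this as a sketch only (the dataset name is a placeholder):

# cache only metadata (not file data) in the ARC for the VM dataset
zfs set primarycache=metadata tank/vmdata

# or turn ARC caching off entirely for that dataset
zfs set primarycache=none tank/vmdata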

2. To get the most speed and still be able to clusterize the storage with 2 nodes, the best solution in my case would be DRBD (Linstor) with LVM underneath.
-> With RDMA-capable cards. The E810 supports RoCEv2, so I'm fine here; I will simply connect one port back-to-back between the 2 servers.
--> Another solution, which would definitely be better, is to buy some used Mellanox ConnectX-5 100GbE cards and connect those back-to-back with InfiniBand (RDMA); that's a cheap option, since both the cards and the transceivers are cheap.
With only 2 servers I don't even need a switch, which is what makes 100GbE or faster possible in the first place (the existing switch has only 2 100GbE ports, and those are already used for uplink and MLAG/stacking :-().
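The network side of such a back-to-back link is simple; a rough /etc/network/interfaces snippet (interface name and addresses are placeholders, MTU 9000 is just a common choice for a dedicated storage link):

# server 1 (server 2 mirrors this with 10.10.10.2)
auto ens1f1
iface ens1f1 inet static
    address 10.10.10.1/24
    mtu 9000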

But I will run some benchmarks anyway and post them here:
ZFS vs DRBD+LVM

Cheers :)
 
