Hardware selection advice - blade/node-server cluster

I am a newbie to Proxmox - I have been playing for a few months with an eBay refurb Dell R820 (details in profile if useful) on a 1Gb LAN at home, which is connected to a remote site via a site-to-site VPN over 100Mbit FTTH.
Although I have set up many PC-level machines, I am new to enterprise gear, so I'd rather ask actual experts before I make stupid decisions.

I am tasked with setting up an OCR "farm" and some hosting.
The OCR farm needs as many cores as possible, so I am considering getting a Dell C6220 4-node box (8x 8-core CPUs + 512GB RAM).
(I assume) I'd set it up as a
  • Proxmox cluster
    • node 1: "router/load balancer" using HAProxy (a rough sketch of what that dispatcher role amounts to is below)
    • nodes 2, 3, 4: LXC/VMs for the OCR app(s)
  • alternatively I could use all nodes for the OCR farm, and put the L/B on the R820
  • I can also add external disk storage to the R820 for capacity
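HAProxy itself would do this with a plain config file rather than code, but as a rough Python illustration of what the "router" node's job boils down to - round-robining incoming pages across the OCR guests - something like this (the worker addresses and the /ocr endpoint are made up, not the actual setup):

```python
# Toy round-robin dispatcher - illustration only; the worker URLs and the
# /ocr endpoint are made up, and HAProxy would do this job far more robustly.
import itertools
import pathlib
import urllib.request

# Hypothetical OCR guests running on cluster nodes 2-4.
WORKERS = [
    "http://10.0.0.2:8080/ocr",
    "http://10.0.0.3:8080/ocr",
    "http://10.0.0.4:8080/ocr",
]
_next_worker = itertools.cycle(WORKERS)  # simple round-robin, no health checks

def dispatch(page: pathlib.Path) -> bytes:
    """Send one scanned page to the next worker and return the OCR result."""
    req = urllib.request.Request(
        next(_next_worker),
        data=page.read_bytes(),
        headers={"Content-Type": "application/octet-stream"},
    )
    with urllib.request.urlopen(req, timeout=60) as resp:
        return resp.read()

if __name__ == "__main__":
    for page in sorted(pathlib.Path("scans").glob("*.png")):
        print(page.name, len(dispatch(page)), "bytes of text")
```

In practice HAProxy adds health checks and retries on top of the same round-robin idea, which is also why it does not need a physical box of its own.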

Is this an appropriate method, or am I way off base?

In short...
  1. What would be a sensible hardware platform?
    1. HP / Dell / Supermicro? (I saw a recommendation that indicated HP ProLiant may be better, but I honestly have no idea.)
    2. I'm based in South Africa - our currency value is crap (about 20:1), and importing things is slow, messy, expensive & govt. controlled, so I have to get the purchase almost right on the 1st attempt.
  2. Is Proxmox the right tool?
Thanks (& sorry if I have put this in the wrong place)
 
Even though I work with multi-node systems a lot, I would personally avoid them. Their limited availability typically makes spare parts and so on more expensive.
I'd look for a standard 1U or 2U server. More options to add interfaces and expansion cards, and easier to re-purpose if things change.

I also see no reason to run dedicated hardware for HAProxy. Build it as a VM/guest on the cluster. Far more efficient.

Supermicro gear is decent. I run it at home and have worked with it in the past.
Dell, HPE and Supermicro are all valid choices.
I'd look for an HPE DL380 Gen9 or Dell R730. If space is limited, go for a DL360 Gen9 or R630.
For Supermicro I don't know the model numbers off the top of my head. Too many variants. ;)
 
You are on the right track. I don't think there is a right or wrong here; it's more about your use case, how much capital you have, and of course availability. Given the current market, availability is really at the core of the decision making.
For us, about 6 months back we looked at everyone and ended up with Supermicro: it was priced decently and, given we had used it before, it checked the boxes for us. We built an HA cluster on it and have been using it in production ever since.
The key is to plan your cluster for the future too, so if you are not comfortable owning the decision for your specific requirements, it might come back and bite you. So maybe get professional help to design the system.
 
Thank you for the info and ideas.
Do you mind if I ask which devices you have used and where they gave hassles? Primarily, our tests show that the OCR requires cores, so I'd like to get an 8+ socket machine for this and, depending on $/CPU, will put in the highest core count that is affordable, e.g. 4x 16 cores vs 8x 8 cores vs 1x 64 cores.

I have looked at the R730, but it only has 2 sockets, so even with 22-core CPUs I can't get as many cores as I'd like.
The cost per core on the node/blade boxes seemed to be ideal, but I will reconsider if spares are a problem.
I will probably do the following for the L/B etc.:
  • R820
    • add an H8xx card + MD12xx/MD14xx
    • possibly set up a NAS/storage VM (OMV/TrueNAS) with passthrough of the H8xx card, so that VM can use all the drives for ZFS
    • set up the load balancer as an LXC/VM
and then add an appropriate but as yet undecided high-core box for the OCR parts (any specific recommendations are welcomed).
 
Do you mind if I ask which devices you have used and where they gave hassles?
First, a multi-node system cannot be split, so it can only live in one place: a rack in a datacenter.

Secondly, these systems typically have limited CPU options due to density constraints, and often tighter environmental specs as well, again because of heat.

Adding additional interface cards is mostly impossible, which again limits your ability to react to changing needs.

We once had to power off the chassis (shared fans and PSUs) for a reason I no longer remember. That meant powering down the whole system, as we could not shut off individual nodes on their own (shared power).

Most of the time there were no reliability issues. It all comes down to the expansion options.

Where they do have a strength: density. But in my personal opinion, neither the cabling nor anything else is significantly reduced.

our tests show that the OCR requires cores,
Cores, or will clock speed gain you something as well?
E.g. can the OCR really make use of cores/threads, or would you achieve the same result with a higher clock rate?
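One quick way to check on the hardware you already have: time the same batch of pages with 1, 2, 4 and 8 workers. Rough sketch only - ocr_page() below is just a dummy CPU burner standing in for whatever your OCR engine actually calls per page:

```python
# Rough core-scaling test: does OCR throughput grow with worker count?
# ocr_page() is a dummy CPU burner - swap in your real per-page OCR call.
import time
from concurrent.futures import ProcessPoolExecutor
from pathlib import Path

def ocr_page(page: Path) -> int:
    """Stand-in for the real OCR call; the loop just keeps one core busy."""
    return sum(i * i for i in range(2_000_000))

def pages_per_second(pages: list, workers: int) -> float:
    """Time the whole batch with the given number of parallel worker processes."""
    start = time.perf_counter()
    with ProcessPoolExecutor(max_workers=workers) as pool:
        list(pool.map(ocr_page, pages))
    return len(pages) / (time.perf_counter() - start)

if __name__ == "__main__":
    pages = [Path(f"page_{i:04d}.png") for i in range(64)]  # dummy names; use real scans
    for n in (1, 2, 4, 8):
        print(f"{n} worker(s): {pages_per_second(pages, n):.1f} pages/s")
```

If throughput roughly doubles each time you double the workers, the job scales with cores and many slower cores are fine; if it flattens out early, a higher clock rate (or faster memory) will buy you more than extra sockets.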
8+ socket machine for this
4x 2 sockets is not really an 8-socket machine, so forgive my scepticism here. If CPU really is everything, you might even be better off doing a physical install on bare metal. Virtualization is great, but it adds overhead.
The cost per core on the node/blade boxes seemed to be ideal
I can't make that judgement. In the end it is your choice. I can only offer some experience which might be valuable.
appropriate but as yet undecided high-core box for the OCR parts (any specific recommendations are welcomed)
If 4 sockets per machine is of interest, you might want to look into the old AMD Opteron 6300 series. Those had up to 16 cores each IIRC, so up to 64 real cores in one system. They should be fairly cheap given their age.
 
Thank you again for the update - very little can beat experience when making decisions like these.
First, a multi-node system cannot be split, so it can only live in one place: a rack in a datacenter. Secondly, these systems typically have limited CPU options due to density constraints, and often tighter environmental specs as well, again because of heat.
I run the setup from home, which will be less than ideal, to say the least.
Adding additional interface cards is mostly impossible, which again limits your ability to react to changing needs.
The system is quite specific, so if any additional cards became necessary, we'd put that functionality into a different machine.
Our tests on a particular job with ~25,000 pages put the server (4 cores) at ~100% CPU for almost 4 hours (~1.5s per page).
When we tested on an 8-core machine, that dropped to closer to ~1s/page, hence the assumption that more cores is better - not very scientific, but a starting point.
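For what it's worth, the back-of-envelope sum behind that assumption looks like this. Treating ~1.5s as the single-core time per page and assuming near-linear scaling are both my own guesses, so the numbers are plug-in values only:

```python
# Back-of-envelope job-time estimate, assuming near-linear scaling with cores
# (only true if pages are processed independently and the job stays CPU-bound).
PAGES = 25_000        # size of the test job
SEC_PER_PAGE = 1.5    # assumed single-core seconds per page; plug in your own measurement

for cores in (4, 8, 16, 32, 64):
    hours = PAGES * SEC_PER_PAGE / (cores * 3600)
    print(f"{cores:>2} cores: ~{hours:.1f} h")
```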
We once had to power off the chassis (shared fans and PSUs) for a reason I no longer remember. That meant powering down the whole system, as we could not shut off individual nodes on their own (shared power).
Most of the time there were no reliability issues. It all comes down to the expansion options.
Where they do have a strength: density. But in my personal opinion, neither the cabling nor anything else is significantly reduced.
Noted.
Cores, or will clock speed gain you something as well?
E.g. can the OCR really make use of cores/threads, or would you achieve the same result with a higher clock rate?
I tried setting up a bunch of rPi4 (8GB) devices, but their performance compared to x86_64 machines was a bit dismal. This was part of the theorising that using "separate" nodes, each with 16+ cores, would be better. It also means that if a glitch occurs, only the relevant node is affected and the others can continue.
4x 2 sockets is not really an 8-socket machine, so forgive my scepticism here. If CPU really is everything, you might even be better off doing a physical install on bare metal. Virtualization is great, but it adds overhead.
I can't make that judgement. In the end it is your choice. I can only offer some experience which might be valuable.

If 4 sockets per machine is of interest, you might want to look into the old AMD Opteron 6300 series. Those had up to 16 cores each IIRC, so up to 64 real cores in one system. They should be fairly cheap given their age.
I will look at these for sure - thanks again
 
When we tested on an 8-core machine, that dropped to closer to ~1s/page, hence the assumption that more cores is better - not very scientific, but a starting point.
What were the clock speeds? What was the CPU type?
The CPU family and clock speed alone can make a difference, so it is easy to jump to the wrong conclusion.

ARM is less powerful in a lot of workloads.
More efficient in terms of power consumption, though.
 