RECOMMENDED SETUP FOR HA/CLUSTER CONFIGURATION

Abhijit Roy

Member
Jul 23, 2019
Hi All,

I would like expert advice on setting up an HA cluster to minimize downtime as much as possible. Can anyone advise me on this with respect to hardware (servers, storage) and networking, as well as software, e.g. which kind of filesystem is recommended here? For context, we have several customized applications with databases and mail servers for which we want this HA setup; please assume minimum downtime and as close to zero latency as possible.

Kindly suggest a minimum standard configuration and category of hardware so we can plan our budget accordingly, with redundancy in all aspects.

Also, what is the exact role of replication in this case, i.e. what exactly is it used for? Can anyone explain?
 
As you probably suspect already, the answer will be "it depends".
To properly advise you, one must take into consideration:
- The specific applications and roles that will be virtualized
- Number of vCPUs needed in normal operation and after growth
- Amount of RAM needed
- Disk capacity needed on day one vs. a year from now
- Hardware vendor availability in your region
- Your network capability today vs. what it should be in the new design
- Many other small considerations that a good partner will account for
- Last but not least, budget

Sure, you can get some advice on the forum, but if you are serious about building a resilient production environment, I suggest working directly with a company or system architect who does this for a living.
There are some partners listed here that you can reach out to: https://www.proxmox.com/en/partners/reseller

We can also advise you on Compute/Network portion if you are interested in combining it with our Storage.

Good luck


Blockbridge : Ultra low latency all-NVME shared storage for Proxmox - https://www.blockbridge.com/proxmox
 
@bbgeek17 To be more specific, please find the points below.

1. On average, memory utilization is quite high for my application; it is always more than 50%.
2. Disk IOPS are also quite high. I generally use SAS; now I am planning for SSD.
3. Regarding the network, we already have 10G switches.
4. For shared storage, what filesystem do you suggest?
5. If redundancy is also required at the storage hardware level, what filesystem (Ceph/Gluster etc.) do you suggest?
6. Lastly, for redundancy of the entire setup with minimum downtime, is HA the best solution, or can you suggest a better option/alternative?
7. We want to build a production-ready, resilient and stable setup, so budget is not a constraint for us; we just want to be sure about all aspects before starting hardware procurement.
8. We currently have approx. 8 TB of live data, a mix of all categories, if that helps you suggest a better approach.
9. vCPU usage is always between 10 and 20% during working hours.
 
Do you want to buy servers or rent them from a datacenter?

If you rent: get 4 standard servers and adjust the local storage (SSD, NVMe), RAM and NICs to your needs (multiple NICs, at least 10 Gb, more is better). For HA I would recommend Ceph (it's really easy to work with and it does everything automatically).

Play with Proxmox for a couple of hours, it's not very difficult. Get some hints from here, and the cluster is ready.
These servers are pretty standard, nothing fancy.
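
To give an idea of how little is involved, here is a minimal sketch of the cluster and Ceph setup commands (the cluster name, join IP, Ceph network and disk path below are placeholders, not values from this thread):

Code:
# on the first node: create the cluster
pvecm create prod-cluster
# on each additional node: join the existing cluster
pvecm add 10.10.10.11
# on every node: install the Ceph packages
pveceph install
# once, on the first node: initialize Ceph with a dedicated network
pveceph init --network 10.10.20.0/24
# on every node: create a monitor and one OSD per data disk
pveceph mon create
pveceph osd create /dev/nvme0n1
# once: create a replicated pool and register it as VM storage
pveceph pool create vmpool --add_storages

Rough capacity math for the ~8 TB mentioned above: with Ceph's default 3x replication and ~20% free headroom, you would want at least 8 x 3 / 0.8 = 30 TB of raw SSD capacity spread across the nodes.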
 
I will buy servers and storage. So you are saying Ceph is the better option if it is configured at the storage level? I am planning to use the servers to run Proxmox and shared storage for the VMs; ideally it will be 3 nodes/servers and 2 storage units for redundancy of data. Please let me know if you have any comments on this.
 
Note: Plan for when things go wrong.
So: UPS units, how much runtime you want to aim for, and the appropriate settings/configuration to ensure Proxmox shuts down as expected.
During this time of shortages, it is crucial to ensure you have spares:
Spare HDDs/SSDs.
Perhaps an extra RAM stick or two.

How many spare parts you keep is entirely dependent on your use case, but the above areas cover the likely culprits.
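
As a concrete example of the "shuts down as expected" part: one common approach (not the only one) is Network UPS Tools (NUT). A minimal sketch, assuming a USB-connected UPS; the UPS name, driver and password are placeholders, and the upsmon user still needs a matching entry in upsd.users:

Code:
# /etc/nut/ups.conf - declare the UPS (the driver depends on your model)
[myups]
    driver = usbhid-ups
    port = auto

# /etc/nut/upsmon.conf - shut this host down when the battery runs low
MONITOR myups@localhost 1 upsmon secretpass master
SHUTDOWNCMD "/sbin/shutdown -h +0"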
 
Is a fencing device necessary in this case? If yes, what kind?
No, internal PVE fencing works out-of-the-box.

1.) On average, memory utilization is quite high for my application; it is always more than 50%.
Without having numbers, this statement has no information.

2.) Disk IOPS are also quite high. I generally use SAS; now I am planning for SSD.
Same ... for Ceph I'd go SSD-only, with the OSDs on NVMe.

3. Regarding the network, we already have 10G switches.
I hope on your servers, too. With Ceph, the optimal way is 6 NICs (3x 2-port), LACP-bonded in three groups (public, cluster interconnect, Ceph).

6. Lastly, for redundancy of the entire setup with minimum downtime, is HA the best solution, or can you suggest a better option/alternative?
I'd put the nodes in three fire compartments connected by low-latency links.
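
To illustrate the bonding suggestion above, a minimal sketch of one of the three LACP bonds in /etc/network/interfaces (interface names and addresses are placeholders; repeat the pattern for the cluster interconnect and the Ceph network):

Code:
auto bond0
iface bond0 inet manual
    bond-slaves eno1 eno2
    bond-miimon 100
    bond-mode 802.3ad
    bond-xmit-hash-policy layer2+3

auto vmbr0
iface vmbr0 inet static
    address 192.168.1.11/24
    gateway 192.168.1.1
    bridge-ports bond0
    bridge-stp off
    bridge-fd 0

Note that the switch ports need a matching LACP (802.3ad) configuration on their side.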
 
I'd put the nodes in three fire compartments connected by low-latency links.
Can you explain the last point in a bit more detail?
 
No, internal PVE fencing works out-of-the-box.
He might be thinking of this: https://clusterlabs.org/pacemaker/d.../html/Pacemaker_Explained/_fence_devices.html
 
I will buy servers and storage. So you are saying Ceph is the better option if it is configured at the storage level? I am planning to use the servers to run Proxmox and shared storage for the VMs; ideally it will be 3 nodes/servers and 2 storage units for redundancy of data. Please let me know if you have any comments on this.
It's just a basic config.
I would not consider external storage anymore; it's a relic from the past (like VDI - some people still hold on to that sh**). The IT industry is now betting on vSAN, Ceph and similar storage solutions. They have many advantages: not just price, but also performance, less complexity, less hardware that can cause problems, ease of use, etc.

But it's everybody's own taste.
 
It's just a basic config.
I would not consider external storage anymore; it's a relic from the past (like VDI - some people still hold on to that sh**). The IT industry is now betting on vSAN, Ceph and similar storage solutions. They have many advantages: not just price, but also performance, less complexity, less hardware that can cause problems, ease of use, etc.

But it's everybody's own taste.
I did not get that exactly. Are you telling me not to use external storage hardware, or something else? I was planning to use one additional storage unit with the Ceph filesystem and another storage unit for redundancy; these would hold the VMs, the two storage units would be clones of each other, and the 3 servers would host Proxmox only. Correct me if I am wrong somewhere.
 
I did not get that exactly. Are you telling me not to use external storage hardware, or something else? I was planning to use one additional storage unit with the Ceph filesystem and another storage unit for redundancy; these would hold the VMs, the two storage units would be clones of each other, and the 3 servers would host Proxmox only. Correct me if I am wrong somewhere.
Well, first of all, you can't use 2 servers for Ceph. You need at least 3 to be in a basic supported configuration.
Second, @pille99 is advising you to run hyperconverged, i.e. both compute and storage on the same hosts. Without any real evidence, he states that the industry is moving towards hyperconverged... Sure, HC has its place and use cases. It also has drawbacks, like having to evacuate a host every time you need to upgrade a storage component.

@Abhijit Roy no offence, but you seem to be at the very beginning of your "infrastructure building" journey and still have a lot to learn about the basics. Sure, if you have unlimited time to implement this, experiment and take advice from the community. I'd just be selective about whom to take advice from. Is a person whose firewalls randomly go on and off, and who changes production config without any understanding of the consequences and then blames the product, really the person you want advising you?

PS: in 2018 the Proxmox team wrote a paper on the best Ceph config, stickied on the forum front page, which shows that 100G is recommended for dedicated storage replication; that is even more true today than 5 years ago.
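
For context on "evacuate a host": a sketch of what a maintenance window on a hyperconverged node typically looks like (the VM ID and target node are placeholders):

Code:
# live-migrate the VMs off the node (or let HA do it)
qm migrate 100 pve2 --online
# tell Ceph not to rebalance while the node is briefly down
ceph osd set noout
# ... perform the maintenance, reboot ...
# re-enable rebalancing afterwards
ceph osd unset noout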
 
I am not telling you what to do, just pointing out that there are better storage solutions (but it's also a matter of personal preference).
Well, first of all, you can't use 2 servers for Ceph. You need at least 3 to be in a basic supported configuration.
Second, @pille99 is advising you to run hyperconverged, i.e. both compute and storage on the same hosts. Without any real evidence, he states that the industry is moving towards hyperconverged... Sure, HC has its place and use cases. It also has drawbacks, like having to evacuate a host every time you need to upgrade a storage component.

https://www.nutanix.com/info/converged-vs-hyperconverged-infrastructure
and here
https://www.fortunebusinessinsights.com/hyper-converged-infrastructure-market-106444

Expected market size to 2029:
https://www.maximizemarketresearch....-hyper-converged-infrastructure-market/23577/

So, please, keep your mouth shut. Debunked.
 
So, please, keep your mouth shut. Debunked.
Whether or not you believe the "white papers" put out by companies who sell those solutions doesn't automatically prove or disprove anything; they are just data points. Telling someone with an opposing view to "shut up" usually just means your argument is weak on its own.

In either case, you didn't make your position stronger, and probably made it weaker.

Sure, HC has its place and use cases. It also has drawbacks, like having to evacuate a host every time you need to upgrade a storage component.
You're doing it wrong :) What exactly do you mean by this?! One of the best selling points of scale-out filesystems is that you never actually need to replace storage components. I have clusters in the field with drives that died years ago, and they will not be touched until the whole node is decommissioned.

If you mean you need to evacuate a node for maintenance, that is true for any clustered environment, not just HC.
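
(For anyone following along, this is what leaving a dead drive in place looks like operationally in Ceph - a sketch, with the OSD ID as a placeholder; the cluster re-replicates the data onto the remaining drives on its own.)

Code:
# mark the dead drive's OSD out; Ceph recreates its data elsewhere
ceph osd out 7
# once recovery finishes, remove the OSD from the cluster records
ceph osd crush remove osd.7
ceph auth del osd.7
ceph osd rm 7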
 
Whether or not you believe the "white papers" put out by companies who sell those solutions doesn't automatically prove or disprove anything; they are just data points. Telling someone with an opposing view to "shut up" usually just means your argument is weak on its own.

In either case, you didn't make your position stronger, and probably made it weaker.
Everybody can conclude whatever they want. Doesn't the fact that, compared to 5 years ago, more and more companies offer HCI-based solutions allow the conclusion that HCI is on the rise - don't you agree? (I just copied the first 2 links from Google; you can read further if you want.) Dell, IBM and HP also offer more and more HCI hardware solutions, Azure too - but they are all wrong, and you are right! Everybody is entitled to see and understand what they like.
Sorry for the "shut up". That guy provoked me for the third time. He just spills out his poison (... without evidence), and when I bring sources (even if it is marketing talk - marketing does not do anything without seeing potential, or have you ever seen a marketing campaign for a non-existent market?), I am his "favorite technology evangelist/sales". @pveuser113 please don't ever talk to me again. Thanks.

Anyway, that is not the subject. @Abhijit Roy asked for recommendations, not for flaming in the thread.
Sure, external storage has its place, which I only see in huge file services, but 95% of all VMs offer one service each and are below 200 GB (Windows); Linux VMs are much smaller, and more and more container and pod images (OpenShift) are coming. These technologies are on the rise - don't you agree? - and those would be ideal conditions for HCI - don't you agree with that as well?

In the end, it all depends on what the majority of your services are and on your budget. These are only my thoughts; you can all think whatever your experience and expertise tell you - that's the function of a forum.
 
Everybody can conclude whatever they want. Doesn't the fact that, compared to 5 years ago, more and more companies offer HCI-based solutions allow the conclusion that HCI is on the rise - don't you agree?
First, a disclaimer: I run HC configurations at multiple sites, with good success. That said, the promise of HC has never materialized since the fad days of 5-10 years ago. You mentioned a few companies and solutions - let me point out some things.

Nutanix:
[attached screenshot]

Self-explanatory.

Dell/HP/IBM: please, for giggles, call their enterprise sales and try to buy their HC offerings. Go on, I'll wait.
VMware: I tried to run vSAN in production. I really did. It's almost like the VMware SBU was forced to make a "me too" product and to have their retail customers beta test it. tl;dr: don't.

Here's the truth: the promised opex savings based on converged modular capex just never materialized. Unless you have the skill, talent, and manpower to properly spec, design, and operate HC, you're not saving any money, and worse, you may not hit your operational requirements. The problem isn't HC in itself; it's the false expectations set by the vendors' sales and marketing, which is why it was always slated to be a fad.
 
