Hardware requirements advice for more than 100 VMs

thenemesis584

New Member
Jan 28, 2019
Hello guys,

I need some advice.

Currently I have 8 computers, just regular PCs with similar hardware specs:

i5 or i7 CPU (2GHz or 3GHz)
32GB DDR4
120GB SSD

Each of these computers is running 16 VMs with this configuration:

1.5GB RAM
CPU: 2 sockets, 2 cores
20GB drive
CentOS 6

And every CentOS VM is running only one Java application.

So, I would like to buy a real server that can run about 125 VMs or containers. I prefer HP ProLiant, but I don't know what CPU and RAM to use in this case.
I'm still new to Linux, networking and virtualization, and I need one machine to replace these 8 machines.

My opinion is that I need a CPU with 32 or 64 cores, or maybe better two CPUs with 32 cores each, but I'm not sure about the RAM.
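For reference, here is a quick back-of-the-envelope total for 125 VMs with the per-VM configuration above (just arithmetic in a throwaway class; the class name is made up):

```java
// Rough totals for 125 VMs at 1.5GB RAM, 2 sockets x 2 cores (4 vCPUs)
// and 20GB of disk each. Illustration only, not a recommendation.
public class SizingSketch {
    public static void main(String[] args) {
        int vms = 125;
        double ramPerVmGb = 1.5;
        int vcpusPerVm = 2 * 2;   // 2 sockets x 2 cores per VM
        int diskPerVmGb = 20;

        System.out.printf("RAM   : %.1f GB%n", vms * ramPerVmGb);  // 187.5 GB
        System.out.printf("vCPUs : %d%n", vms * vcpusPerVm);       // 500 vCPUs
        System.out.printf("Disk  : %d GB%n", vms * diskPerVmGb);   // 2500 GB
    }
}
```

So roughly 190GB of RAM and 500 virtual cores allocated in total; the CPU side can only work if the load is bursty enough to over-commit.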
 
Sounds like you have overprovisioned your storage considerably, as 20G x 16 is 320G. That could get you into a bad situation pretty quickly.

You are definitely going to need something with dual sockets and lots of cores. Maybe start with 12x 16G DIMMs for 192G of RAM. Then you can expand if it's a standard two-socket system (24 DIMM slots).

You're also going to need an SSD array of some sort if you're used to the VMs running on SSDs already.
 
Sorry, every machine has 500GB SSDs, not 120GB.

Storage is not the problem; CPUs and RAM are a bit difficult for me.

I don't know how a CPU will handle 100 VMs or containers, and I don't even know what to use in this case. What is better, a VM or a container?
 

Sounds like you're OK on the disks then.

It really depends on your workload and what you're trying to do. I prefer the isolation of KVM and running my own OS; containers use the kernel of the host, so you're a bit limited there.

We currently run HP DL380 Gen9/Gen10s with 384G of RAM and have roughly 60 VMs on each front end. Our workload and requirements are different, but I do think you could push a host like this to 100 VMs pretty easily, depending on the workload. Don't forget to put some thought into the storage.
 
Ok, so here is more info.

I have created one VM and installed CentOS 6. The network is private, so the IP is something like 10.xxx.xxx.128.
I have installed Java JRE 8 and I'm running it.
When I finish configuring the VM, I clone it, so every VM is the same; the only differences are the IP, the MAC and the Java app running in it.

Now, the Java application is a client. It parses data from the devices that connect to it, and every connection is a thread.
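Roughly, that pattern is one thread per accepted connection, something like this (a simplified sketch; the class name, port and handler are made up for illustration, not taken from the actual app):

```java
// Minimal thread-per-connection TCP listener, as described above.
// Parsing and database code are omitted; names are hypothetical.
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.ServerSocket;
import java.net.Socket;

public class GpsListenerSketch {
    public static void main(String[] args) throws Exception {
        int port = 5023; // hypothetical listening port
        try (ServerSocket server = new ServerSocket(port)) {
            while (true) {
                Socket device = server.accept();          // one GPS device connects
                new Thread(() -> handle(device)).start(); // one thread per connection
            }
        }
    }

    private static void handle(Socket device) {
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(device.getInputStream()))) {
            String record;
            while ((record = in.readLine()) != null) {
                // parse the record and send the result to the remote database
            }
        } catch (Exception e) {
            // connection dropped; the thread simply ends
        } finally {
            try { device.close(); } catch (Exception ignored) {}
        }
    }
}
```

So the thread count you see in htop tracks the number of devices connected at that moment.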

This is htop from one VM

(screenshot attached)

and this is htop from its host machine:

(screenshot attached)
 
As adamb said, it really depends on the load, which depends on the application and its load pattern.

You should first see what types of load are produced; CPU, network and (storage) IO need to be considered.
Memory is fixed and should not be over-committed.
Regarding CPU, what do the Java apps do? Constant computation (the htop screenshot suggests this is not the case) or more of an occasional request/computation? If it's occasional you can over-commit CPU cores (up to a certain limit); otherwise I'd make sure there are as many physical CPU cores as the sum of the virtual machines' vCPU cores.

Network/IO again depends on what the app does. Does it produce network traffic? Does it need to read and/or write from/to storage? And if so, how much average bandwidth does it require?
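To put that CPU rule into numbers (a sketch with example figures; the dual 32-core box is only an example):

```java
// Required physical cores depend on whether the guests run at constant load
// or only occasionally. VM count and vCPU count are taken from the posts
// above; the physical core count is an invented example.
public class CpuCommitSketch {
    public static void main(String[] args) {
        int vms = 125;
        int vcpusPerVm = 4;                 // 2 sockets x 2 cores per VM
        int totalVcpus = vms * vcpusPerVm;  // 500

        // Constant computation: physical cores should cover the full vCPU sum.
        System.out.println("Constant load   -> need ~" + totalVcpus + " physical cores");

        // Occasional load: over-committing is acceptable up to some limit.
        int physicalCores = 2 * 32;         // e.g. a dual 32-core system
        double ratio = (double) totalVcpus / physicalCores;
        System.out.printf("Occasional load -> over-commit ratio %.1f:1 on %d cores%n",
                ratio, physicalCores);      // ~7.8:1
    }
}
```

Whether something like 8:1 is acceptable depends entirely on how idle the VMs really are, which is exactly the constant-vs-occasional question.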
 

Ok, I understand that, but as I said I'm new to all of this and it is very hard for me to determine what hardware I should use.

As for Java, its purpose is to parse the data received from GPS devices and send the results to a remote database. So it is occasional request/computation, because I don't know which device will connect or how long that connection will last.

Here are iotop and iftop screenshots:

(screenshot attached)

(screenshot attached)

Of course, these values keep changing, but I hope you get the idea.
 

Are you OK with having all your eggs in one basket? With your current setup, if you lose 1 host, you're still operational. If you lose this new host, you're 100% down.

I definitely think a standard 2-socket system with 192G of RAM would be a good start. Use RAM DIMMs big enough to hit 192G without populating all the slots, so you have more room for growth.

Or you could come up with some type of shared storage and run multiple front ends for redundancy and scaling.
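For example, two ways of reaching the suggested 192G on a typical 24-slot dual-socket board (example layouts only; the real server's per-channel memory population rules still apply):

```java
// Two example DIMM layouts that reach 192GB while leaving slots free.
// Slot count and DIMM sizes are examples, not tied to a specific model.
public class DimmSketch {
    public static void main(String[] args) {
        int slots = 24;                           // typical 2-socket board
        int[][] options = { {12, 16}, {6, 32} };  // {number of DIMMs, size in GB}
        for (int[] o : options) {
            System.out.printf("%d x %dGB = %dGB, %d slots free%n",
                    o[0], o[1], o[0] * o[1], slots - o[0]);
        }
        // 12 x 16GB = 192GB, 12 slots free
        //  6 x 32GB = 192GB, 18 slots free
    }
}
```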
 
No, of course it will not be one machine. I will buy two machines; the other will be a backup machine.

The size of the storage is not that important, it just needs to be SSDs for speed. The size is minimal, just enough to keep the logs that are not older than 6 months.
 

If you're buying another server, then in my honest opinion you should consider some type of shared storage. You don't need to go full-out HA. Then you can make use of both servers and have redundancy.
 
Well, my plan is to buy two servers, but it depends on the finances; for a start it will be one.

When I buy another one, I will set it up to work as a backup machine. That is redundancy, not a backup for data or anything else; we just misunderstood each other.

Now I just need to determine what kind of hardware to buy and whether I should go for VMs or LXC.

Since I will have the same OS everywhere, maybe it is better to go with LXC.
 

If you really clone all your containers from one, it would be great if you could use linked clones; then the storage requirement really is minimal.

Why do you need so many VMs for the computation? If every machine has work to do, you are hopelessly over-committed if each of your 100 VMs has 2 CPUs. Even with two 64-core Ryzens, you have "only" 128 cores.

As has been pointed out before, KVM has the advantage of using KSM, which LXC can't, so you can work with less memory if all VMs are almost identical.
 

As I already mentioned, Java creates a thread for each established connection; take a look at the htop screenshot in post #5.
Because of that it uses CPU resources, so I've limited the number of devices in Java, but it has never happened that all devices were connected at the same time.

But a scenario that sometimes happens is that a VM stops working, just freezes, and let's say that VM is down for 5-6 hours. When this happens, the data is on hold and accumulating. When the VM is started again, all of that data (a huge amount) needs to be parsed, plus the PostgreSQL data, and that overloads the CPU. In this situation the CPU is at 100% and the machine is very slow; it needs some time to work through all the data that accumulated in the meantime.

In the near future the number of devices will be much bigger, which means the computation time will be bigger, so I need to prepare for that.

As I already said, my problem is not RAM, I will manage that.
I need to determine what type of CPU to use and decide whether I should go with LXC or VMs.
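Just to put that catch-up scenario into numbers (all rates below are invented; this is only meant to illustrate the effect):

```java
// After a VM has been down for a while, the backlog has to be processed
// on top of the live traffic. Every rate here is an example value.
public class BacklogSketch {
    public static void main(String[] args) {
        double downtimeHours = 6;     // how long the VM was frozen
        double arrivalPerSec = 200;   // incoming GPS records per second (example)
        double processPerSec = 500;   // records one VM can parse per second (example)

        double backlog = downtimeHours * 3600 * arrivalPerSec;             // waiting records
        double catchUpHours = backlog / (processPerSec - arrivalPerSec) / 3600;

        System.out.printf("Backlog: %.0f records, catch-up at full load: ~%.1f h%n",
                backlog, catchUpHours);  // ~4.3 million records, ~4 h
    }
}
```

The closer the steady arrival rate gets to what one VM can actually parse per second, the longer that catch-up takes, which is why the CPU question matters more here than the RAM.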
 
