Container Density

300cpilot

What metrics do I look at to increase density? We are after a goal of 5-7 watts per container; with 120 CTs running right now we are at 18.33 watts per CT. Our largest expense is electricity, so I am open to ideas. Currently the whole system slows down and clients complain if we go over 70 CTs per node.
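To put that target in numbers, here is a quick back-of-the-envelope sketch. The wattage figures are the ones above; the assumption that total draw stays roughly flat as density goes up is mine.

```python
# Rough power-per-container math, assuming total draw stays roughly flat
# as more CTs are packed onto the same hardware (an assumption, not a measurement).
containers = 120                # CTs running today
watts_per_ct_now = 18.33        # measured figure from above

total_watts = containers * watts_per_ct_now      # ~2200 W across the cluster
for target in (5, 7):                            # the 5-7 W/CT goal
    needed = total_watts / target                # CTs needed at the same draw
    print(f"At {target} W/CT we need ~{needed:.0f} CTs on the same power budget "
          f"({needed / containers:.1f}x today's density).")
```

So hitting 5-7 W/CT means roughly 2.6x to 3.7x today's density on the same power draw, which is why the 70-CT-per-node ceiling is the real problem.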

Also, are there any plans for running Proxmox on the ThunderX ARM chips?
 
I agree that AMD is currently the best, but I was asked to provide a 2- and 5-year plan for future expansion paths. This includes consolidating our current virtualization platforms: Nutanix, VMware, and legacy XenServer from Citrix.

Would it be possible to contact anyone who is running a density of over 100 VMs per node? I need to have an idea of the hardware they are using. Switches, too, would be great.
 
I routinely run >100 VMs per node; this isn't really a big issue unless and until some VMs begin to consume their allotted resources at the same time. Proxmox isn't very good at handling this, and it requires some outside programmatic and manual intervention. The switches I normally run are Arista 7050s for cluster/Ceph and Aruba 1G switches for front-end connectivity, but I don't think this is of actual use to you.

What is your use case? What are your VMs doing, and how many resources (CPU/RAM/bandwidth) does each VM require?
 
These are guests that sit idle 99% of the time. But when one goes active, usually dozens will go active at the same time, on the same node.

A typical VM/CT is 3 GB RAM, 2 cores, and 60 GB of disk space, currently running Ubuntu Server 18. All of the VMs in question are identical in size and purpose. Generally they are used for DB table lookups; each has a 30 GB PostgreSQL database on it. It may not be the best design, but I'm just the guy keeping someone else's "baby" alive.

The current (2) nodes are HP DL380s with dual 8-core 2.0 GHz CPUs; each has 50 TB of raw storage in external SAS enclosures. They run into slowness at 60-70 CTs running on each of them. These are test servers that I started with a while back but that got forgotten about until it was time to renew our licenses. So now the questions about my report are being asked again... Not like I wasn't busy. Geez.

We are in a colo, where space and power are big costs, so we want to move approximately 800 VMs from our existing environment to Proxmox in the least amount of space. Proxmox would be a lot cheaper for the same CPU counts. Plus, I like it. We will stay in a colo.
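Rough math on what that 800-guest move looks like at a couple of node sizes; the node configurations below are hypothetical examples, not hardware we have picked.

```python
# Back-of-the-envelope node count for moving ~800 guests (3 GB RAM, 2 cores each).
# The candidate node sizes are hypothetical examples, not recommendations.
guests = 800
ram_per_guest_gb = 3

candidate_nodes = [
    {"name": "2x 16-core, 512 GB RAM", "ram_gb": 512},
    {"name": "2x 32-core, 1 TB RAM",   "ram_gb": 1024},
]

for node in candidate_nodes:
    # RAM tends to be the hard ceiling for mostly-idle guests; CPU mainly
    # limits how many can wake up and do work at the same time.
    by_ram = node["ram_gb"] // ram_per_guest_gb
    print(f'{node["name"]}: ~{by_ram} guests by RAM, '
          f'so ~{guests / by_ram:.1f} nodes for all {guests}')
```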
 
I guess what I need to see is which actual CPUs are capable of handling the load. Which server models are currently working? Do we need to go to 4-CPU servers? We have a 10 Gb Cisco switch for the testing; production would be on Cisco Nexus 9Ks. Bandwidth is really low on the network. I/O on the disks is under 100 when I have actually caught it. Disks are local, not shared, running ZFS.

When running the guests as VMs I got even less density; as CTs I got about one-third more.
 
The physical CPUs don't care what generates the load. If you have idle VMs, the theoretical limit on the number of virtual machines is only how much RAM you have and how many open files your host kernel allows (this is tunable). Even if your CPUs are completely subscribed, the system won't necessarily crash; your VMs will simply slow down as they wait for resources.
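If you want to see how close a host actually is to those two ceilings, something like this Linux-only sketch (run on the Proxmox host itself; the ~3 GB per guest figure is taken from the workload you described) gives a rough answer:

```python
# Quick check of the two ceilings mentioned above: RAM and open files.
# Linux-only; run directly on the host.

def read_meminfo_kb(key):
    with open("/proc/meminfo") as f:
        for line in f:
            if line.startswith(key + ":"):
                return int(line.split()[1])
    raise KeyError(key)

# Open file handles: /proc/sys/fs/file-nr is "allocated  unused  max"
with open("/proc/sys/fs/file-nr") as f:
    allocated, _unused, fmax = (int(x) for x in f.read().split())

mem_available_gb = read_meminfo_kb("MemAvailable") / (1024 * 1024)
ram_per_guest_gb = 3  # from the workload description above

print(f"Open files: {allocated} of {fmax} "
      f"({100 * allocated / fmax:.1f}% of fs.file-max; raise via sysctl if needed)")
print(f"MemAvailable: {mem_available_gb:.1f} GB "
      f"-> roughly {int(mem_available_gb // ram_per_guest_gb)} more 3 GB guests")
```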

The way to size your system with hyperthreaded nodes is to aim at ~35% CPU utilization UNDER FULL LOAD. Any more and you may start getting slowdowns or crashes, as system I/O processes may start timing out. In the case of an 8-core (16-vcore) node, that would be approximately 6 cores' worth of load. My typical server has 20 cores (40 vcores) for a ~125 VM load, but the oldest are slated to be replaced with higher-density servers in the near future.
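Worked out as numbers (the per-guest load figure here is a placeholder assumption; you have to measure what your own guests actually draw when they wake up):

```python
# The ~35% rule worked out for two node sizes.
# 0.5 vcore of demand per busy guest is a placeholder, not a measurement.

def sizing_at_35_percent(vcores, vcores_per_busy_guest=0.5):
    budget = 0.35 * vcores                       # vcores' worth of load to aim for
    return budget, int(budget / vcores_per_busy_guest)

for vcores in (16, 40):                          # 8C/16T and 20C/40T nodes
    budget, busy_guests = sizing_at_35_percent(vcores)
    print(f"{vcores} vcores -> aim for ~{budget:.1f} vcores of load "
          f"(~{busy_guests} simultaneously busy guests at 0.5 vcore each)")
```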

As for the network discussion: I don't know what relevance it has to your question, as there was no discussion of required I/O latency/throughput.
 
I know this question is like asking how much water a sock will hold.

Thanks for the input; I will be ordering the most cores I can afford.