Proxmox with 48 nodes

Nathan Stratton

Well-Known Member
Dec 28, 2018
I know the recommended maximum number of nodes in a cluster used to be 32, but is that still the case? My boxes are all dual E5-2690v4 with dual 40 Gig Ethernet. I would like to have one cluster with 48 nodes, but is that a bad idea? Should I go with two clusters of 24 nodes instead?
 
so before any answer would be applicable...

why? what is your use case?

Also, 40G for cluster traffic is effectively the same as 10G (same latency), so you should be fine, but depending on the REST of your system architecture it will likely not be enough as your cluster gets larger (more prone to contention issues). Not in the sense that the individual links aren't "fast" enough, but that there is insufficient separation of traffic.

Lastly, you might want to consider whether your design is penny wise and pound foolish. Broadwell isn't the most power-efficient architecture in 2025. If you're designing something of actual size, you may be better off with fewer nodes of higher power/performance density. It will cost more to deploy but will pay for itself in a matter of months (18-24 is common) in power and cooling savings.
 
Yes, latency is the same for 40/10, but with dual ports and VLANs for traffic separation, I thought I would be ok. As for why, the hardware was reclaimed and that is what we have... You're right about Broadwell, but again, it's what we have, and I'm not so sure about the cost savings. We compared the E5-2690v4 to EPYC 9575F, based on spec numbers, we can absolutely do it with fewer servers, but surprisingly, the power numbers are not as far apart as I thought.
 
We compared the E5-2690v4 to EPYC 9575F, based on spec numbers, we can absolutely do it with fewer servers, but surprisingly, the power numbers are not as far apart as I thought.
E5-2690v4 is 14c@2.6GHz, 135W.
Epyc 9575F is 64c@3.3GHz, 400W.

Even if we ignore the MUCH newer process node, WAY faster memory, and newer PCIe generations, and just count each at equal IPC:
AMD (64*3300)/400= 528 instructions per watt
Intel (14*2600)/135= 269.

You would literally need half as many. I suppose I don't know what you thought...
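
For anyone who wants to rerun those back-of-the-envelope numbers, here is a minimal Python sketch of the same calculation, using only the core counts, base clocks, and CPU TDPs quoted above (equal IPC assumed; turbo, hyperthreading, and everything outside the CPU package ignored):

```python
# Rough sanity check of the per-watt numbers above, counting only
# cores * base clock (MHz) per watt of CPU TDP, with equal IPC assumed.

def units_per_watt(cores: int, base_mhz: int, tdp_w: int) -> float:
    """Naive 'cpu units' (cores * MHz) per watt of CPU package TDP."""
    return cores * base_mhz / tdp_w

xeon = units_per_watt(cores=14, base_mhz=2600, tdp_w=135)   # E5-2690 v4 as quoted above
epyc = units_per_watt(cores=64, base_mhz=3300, tdp_w=400)   # EPYC 9575F as quoted above

print(f"Xeon E5-2690 v4: {xeon:.0f} units/W")   # ~270
print(f"EPYC 9575F:      {epyc:.0f} units/W")   # ~528
print(f"ratio:           {epyc / xeon:.2f}x")   # ~1.96x
```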
 
2690v4 is actually 22 cores, and you're neglecting the power of the FANS, GPUs, hard drives, etc. But you're right, you need fewer than half as many; still, the power consumption is not that much different between the two in our workloads. Also, when you factor in the price of the used E5-2690v4 versus new Epyc 9575F systems, the break-even is a LOT longer than 18-24 months.
 
Or maybe only a quarter (10-12) is already enough?
A dual-socket system with 2x Intel Xeon E5-2690v4 has the potential for 72800 "cpu units" of performance (more with turbo + hyperthreading, but let's leave that for now).
A dual-socket system with 2x Epyc 9575F has the potential for 422400 "cpu units" (same comment applies) and is roughly twice as power efficient. A single system can replace almost SIX of the Xeon ones, with faster memory, more memory channels, and faster buses for everything from networking to storage.

Technology doesn't stand still. The Xeon part is nearly 10 years old.
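
A similar sketch for the per-node comparison, again counting nothing but cores × base clock (so it ignores memory bandwidth, I/O, and the real-world power draw of fans, drives, and GPUs raised above):

```python
# Same toy "cpu units" metric (cores * base MHz), now per dual-socket node,
# to see how many of the old nodes one new node could replace on paper.
# CPU throughput only: memory bandwidth, I/O, fans, drives, GPUs ignored.

XEON_NODE = 2 * 14 * 2600    # 2x E5-2690 v4  ->  72,800 units
EPYC_NODE = 2 * 64 * 3300    # 2x EPYC 9575F  -> 422,400 units

replacement_ratio = EPYC_NODE / XEON_NODE        # ~5.8 Xeon nodes per EPYC node
epyc_nodes_for_48 = 48 * XEON_NODE / EPYC_NODE   # ~8.3 EPYC nodes

print(f"one EPYC node ~ {replacement_ratio:.1f} Xeon nodes")
print(f"48 Xeon nodes ~ {epyc_nodes_for_48:.1f} EPYC nodes (CPU throughput only)")
```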
 
But the real question still hasn't been answered ... maybe someone would like to build 48 nodes out of AMD EPYC™ 9575F ...
What would the actual limit be, and how could it be reached, e.g. with what corosync tuning? That question is still in the room ...
 
What would the actual limit be, and how could it be reached
The reason you can't find the answer is that it's not something you can answer in a vacuum. As I alluded to above, it depends on just how dependable the network is, and how spammy/sensitive the service using it is.

"conventional wisdom" has been that you don't want to climb beyond 32 nodes for pve use. I would say that unless there is a USE CASE to challenge this limitation, I dont bother investing the time and effort to figure it out.
 
Hi @Nathan Stratton and all,

You need clear guidance here: do not do that unless you have a very compelling reason to.

a) Your hardware is discontinued and past the end of service, which significantly increases the likelihood of component failure.

b) As the number of virtual machines grows, the pmxcfs payload becomes larger and more demanding to synchronize across nodes.

c) Software updates introduce additional complexity and coordination challenges at this scale.

We work with several customers operating at this scale, including those who have tested beyond 32 nodes. None of them found significant value in deploying a single, monolithic cluster. You will be far better off with separate failure domains for fault isolation, simpler management, and easier maintenance.

I always encourage people to ask themselves: "How much of my infrastructure can I tolerate losing if there were an unexpected failure?" If that answer is not 100%, start dividing.


Blockbridge : Ultra low latency all-NVME shared storage for Proxmox - https://www.blockbridge.com/proxmox