Hello forum,
[Edits done: Clarifications added to counter misunderstandings in original replies, also changed Proxmox to PVE]
I am considering building a new production (not lab) Proxmox VE (PVE) cluster with HA for both running VMs and the underlying storage. Given the current DRAM shortage, I have to consider how to optimize the design in terms of total DRAM purchase for the entire cluster.
This is going to be a migration of VMs from another system, so the number of VMs and their sizes are not in question or open for redesign. Only the design of the new cluster hardware and how PVE is installed on it.
I am currently considering two options:
A. Two PVE nodes and a third, non-PVE machine acting as a tie-breaker, helping the cluster decide which node stays responsible for production execution if the network links between the two nodes fail. In DRAM terms, each PVE node needs enough physical memory to run all the VMs while the other node is down or isolated, so the DRAM per node is (sum of all VM memory) + (clustering overheads such as ZFS overhead); the total cost is thus 2 x (sum of all VM memory) + 2 x (cluster overheads).
B. Three PVE nodes with no tie-breaker outside the cluster. In DRAM terms, if one node fails, the two remaining nodes split its VMs between them, so the DRAM per node is (half the sum of all VM memory) + (clustering overheads such as ZFS overhead); the total cost is thus 1.5 x (sum of all VM memory) + 3 x (cluster overheads).
Option B thus theoretically saves 25% of the VM HA memory cost, but adds 50% to the cluster-overhead memory cost, and also adds the cost of a third physical machine.
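To make the comparison concrete, here is a small sketch of the two cost formulas above. The numbers plugged in (256 GB of total VM memory, 16 GB of per-node overhead) are purely assumed placeholders, not real figures from my cluster:

```python
# Hypothetical sizing sketch comparing options A and B.
# vm_total_gb and overhead_gb are assumed example values, not real cluster data.
vm_total_gb = 256   # sum of memory assigned to all VMs (assumption)
overhead_gb = 16    # per-node clustering overhead, e.g. ZFS (assumption)

# Option A: 2 nodes, each sized to run ALL VMs alone after a failover.
option_a = 2 * vm_total_gb + 2 * overhead_gb

# Option B: 3 nodes, each sized for half the VM memory (its own third
# plus a sixth absorbed from a failed peer).
option_b_per_node = vm_total_gb / 2 + overhead_gb
option_b = 3 * option_b_per_node  # = 1.5 * vm_total_gb + 3 * overhead_gb

print(option_a, option_b)  # 544 432.0
```

With these placeholder numbers the VM portion drops from 512 GB to 384 GB (the 25% saving), while the overhead portion grows from 32 GB to 48 GB (the 50% increase).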
The official PVE "system requirements" were clearly written when DRAM was cheap; they suggest, without stating reasons, adding 1GB of RAM per TB of disk to the clustering overheads. The question is how far this can safely be squeezed for cost, perhaps to 0.5GB/TB or 0.25GB/TB, corresponding to 2 bytes or 1 byte per 4KB disk block. The fundamental issue is how much of the stated overhead is somehow forced to be kept in physical node RAM at all times, versus how much is just cached data that can be reloaded from disk or regenerated on the fly.
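The bytes-per-block figures above follow from simple arithmetic, assuming the GB/TB recommendation is read as GiB per TiB and a 4 KiB block size:

```python
# Convert a GiB-per-TiB RAM overhead into bytes per 4 KiB disk block.
# Assumes binary units (GiB/TiB) and 4 KiB blocks.
BLOCK = 4 * 1024                       # 4 KiB block size (assumption)
blocks_per_tib = (1024 ** 4) // BLOCK  # 268,435,456 blocks per TiB

for gib_per_tib in (1.0, 0.5, 0.25):
    bytes_per_block = gib_per_tib * 1024 ** 3 / blocks_per_tib
    print(f"{gib_per_tib} GiB/TiB -> {bytes_per_block:g} bytes/block")
# 1.0 GiB/TiB -> 4 bytes/block
# 0.5 GiB/TiB -> 2 bytes/block
# 0.25 GiB/TiB -> 1 bytes/block
```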
Another question affecting the purchase calculation is whether PVE HA requires complete copies of all running VM memory on the node that would take over if the running node crashes, or whether PVE just keeps memory snapshots on redundant disks until the moment of failover. Obviously, if PVE restarts VMs after their physical node fails, then no DRAM needs to be reserved in advance on the node that will potentially run the VM after failover. My calculations for scenario B above assume near-zero physical DRAM reservation for potential failover of VMs running on other nodes: if the 3 nodes each use y/3 GB for VM memory, each node needs y/2 GB for VMs, of which y/6 GB just idles waiting for the arrival of HA-restarted VMs from other nodes. Keeping live VM memory clones instead would need 2/3 x y GB per node, of which y/3 GB is idle clone memory (y/6 GB mirrored from each of the other two nodes).
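The fractions in that paragraph can be checked mechanically. This sketch normalizes the total VM memory y to 1 and recomputes both variants (restart-on-failover versus live memory clones); it is just a restatement of my assumptions, not a claim about how PVE actually behaves:

```python
# Worked check of the scenario-B failover arithmetic.
# y = total VM memory across the cluster, normalized to 1 (assumption).
from fractions import Fraction

y = Fraction(1)
running_per_node = y / 3  # each of 3 nodes actively runs a third of the VMs

# Restart-on-failover: a surviving node absorbs half of a failed peer's VMs.
needed_per_node = running_per_node + running_per_node / 2  # y/3 + y/6 = y/2
idle_per_node = needed_per_node - running_per_node         # y/6 waits idle

# Live clones: each node also holds warm copies of peers' VM memory,
# y/6 mirrored from each of the other two nodes.
clone_per_node = running_per_node + 2 * (running_per_node / 2)  # 2y/3
idle_clone_per_node = clone_per_node - running_per_node         # y/3

print(needed_per_node, idle_per_node, clone_per_node, idle_clone_per_node)
# 1/2 1/6 2/3 1/3
```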
Clarification, also stated in a reply below: "clustering overheads" means all the physical memory on PVE nodes stemming from running the full HA suite of features, including disk-related in-memory metadata and VM-related metadata. For example, if the HA storage mechanism for virtual disks requires an in-memory data structure of 4 bytes per PVE node per 4KB virtual-disk block, then this adds an overhead of 1GB/TB. As another example, if the HA mechanism for VMs needs an in-memory data block per VM the size of the VRAM of a large gaming GPU, such as 16GB, then that adds 16GB/VM. I obviously hope the numbers are smaller, such as 1MB/TB disk overhead and 2MB/VM machine overhead (including 1MB virtual VRAM).
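The 4-bytes-per-block example works out like this (again assuming binary units and 4 KiB blocks; the per-block figure is purely illustrative):

```python
# Illustrative check: 4 bytes of in-memory metadata per 4 KiB virtual-disk
# block comes to 1 GiB of RAM per TiB of virtual disk.
PER_BLOCK_BYTES = 4            # hypothetical metadata per block (assumption)
BLOCK = 4 * 1024               # 4 KiB block size (assumption)

blocks_per_tib = (1024 ** 4) // BLOCK
overhead_bytes_per_tib = PER_BLOCK_BYTES * blocks_per_tib

print(overhead_bytes_per_tib / 1024 ** 3)  # 1.0 -> 1 GiB per TiB of disk
```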