Newbie question on RAM overprovisioning (8 VMs - 64 GB RAM)

IcePlanet

New Member
Jul 8, 2025
Hello, I have read a lot of information about RAM allocation here and on Reddit, and watched YouTube videos... Everywhere the recommendation is not to overprovision RAM.

What I have is a 'special' application, where 8 VMs (Windows based) are collecting data (4GB is perfectly OK for that), but once per day they run calculations that take 16GB of RAM (VM total, including the 4GB for normal collection) for anywhere from 5 seconds to 20 minutes. My physical RAM is 64GB.

I tried this scenario at work on ESXi and it worked perfectly, but for my home usage ESXi is somehow 'too big' and has compatibility issues with non-server HW...

I tried Windows Hyper-V and it does not work. If I assign 4GB to the machines with dynamic allocation up to 18GB, so a VM can take the 16GB when needed, it crashes: Hyper-V is too slow to assign the memory, and the application inside the VM runs out of RAM shortly before Hyper-V has increased it.

This evening I plan to try Proxmox; I have never used Proxmox before. To increase the chances it ends up better than with Hyper-V, could you please give me advice on what to read/prepare to make it a success? Points I'm thinking of:

1. It is clear that if more than 2 machines try to run the calculation at the same time, it will run out of physical memory. Are there any 'prevention' mechanisms I can utilize, for example compression of memory? Memory sharing? In the worst case, caching to SSD?

2. Options to fine-tune Proxmox's 'strategy' for memory assignment/removal? How fast, how big chunks?

3. Should I use ballooning or not? I'm not sure, because it allows 'returning' some memory to Proxmox, but I read that getting it back can be a problem and lead to crashes... (not sure if this is still relevant, it was a years-old post...)

4. Would you advise any specific configuration changes for this kind of setup?

5. Should I even try this? :-D

Background: The intention to use VMs is based on the fact that the SW is poorly written and cannot use more than one thread. The calculation process is very slow and currently takes 15-20 min, but when tried on the new physical HW (which should host Proxmox from tomorrow) as a bare-metal installation, the calculation takes only seconds. Upgrading all 8 computers to this HW spec would be rather expensive. Getting more RAM is problematic; maybe 128GB would work, but it is currently not available locally to purchase.
Each VM would have all CPU cores assigned to give it as many resources as possible during the calculation and to free RAM as soon as possible. On the physical HW the load is usually under 1%.
 
Everywhere the recommendation is not to overprovision RAM.
Yes. That's the only resource type you cannot simply over-commit and be fine.

The only way to have 16 GiB inside the VM while the host only needs to supply far less is to use a swap file inside the VM. Of course, this is the slowest approach possible.

Ballooning might help a bit, but in my experience it is slow. When one VM has all the memory and another VM wants 16 GiB now (quickly, in a single allocation), it just fails. "Failing" here means there is a high chance that the OOM killer kicks out the VM with the most actively assigned memory. Without asking any further...

For some gain (not a large factor, perhaps 5 to 20 percent) you may look at apt show zram-tools. A classic swap file on the host is also not really recommended; I have not tested the actual behavior recently, but I would not expect it to work fluently.
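If you want to try the zram route, a minimal sketch of the host-side setup might look like this (package and config file as shipped on Debian; the algorithm and size here are just example values):

Bash:
# install the helper package on the Proxmox host
apt install zram-tools

# /etc/default/zramswap -- example values, adjust to taste:
#   ALGO=zstd        # compression algorithm
#   PERCENT=25       # size of the zram device as a percentage of physical RAM

systemctl restart zramswap
swapon --show        # the zram device should now be listed as swap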

If possible: just assign as much RAM as needed (your 16 GiB) and keep all VMs stopped. Start a VM only when it is scheduled to do its job. Make sure there is no overlapping timing...
 
3. Should I use ballooning or not? I'm not sure, because it allows 'returning' some memory to Proxmox, but I read that getting it back can be a problem and lead to crashes... (not sure if this is still relevant, it was a years-old post...)
Ballooning might help a bit, but in my experience it is slow. When one VM has all the memory and another VM wants 16 GiB now (quickly, in a single allocation), it just fails. "Failing" here means there is a high chance that the OOM killer kicks out the VM with the most actively assigned memory. Without asking any further...
I fear ballooning works even worse and simply takes memory away (without negotiating with the OS inside the VM) from all VMs (if they have the same number of shares), and in this case it will only result in all VMs having the same amount (which is apparently too little for the computation).

Maybe KSM would work if all VMs run the same OS and very similar programs? You could probably enable KSM permanently in /etc/ksmtuned.conf. However, this does not help if all (or maybe even more than one) VMs run the memory-intensive stuff at once.
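As a hedged sketch (not tested for this workload), the threshold in /etc/ksmtuned.conf can be raised so KSM starts merging earlier than it would by default:

Bash:
# /etc/ksmtuned.conf -- make KSM kick in earlier
# (by default ksmtuned only starts merging once free memory drops below ~20%)
#   KSM_THRES_COEF=50

systemctl restart ksmtuned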

If possible: just assign as much RAM as needed (your 16 GiB) and keep all VMs stopped. Start a VM only when it is scheduled to do its job. Make sure there is no overlapping timing...
Therefore, I also think this would be the best strategy. Use the Proxmox API or command-line tools and cron to schedule starting and shutting down the VMs so that they do not overlap. Maybe some kind of fancy suspend-to-disk, hibernation or snapshot is possible, but I would start with simply running one VM at a time.
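A minimal sketch of such a schedule with cron on the host (the VM IDs and times are placeholders, not from your setup):

Bash:
# /etc/cron.d/calc-vms -- run the calculation VMs one after another
# so their 16 GiB peaks never overlap
# m h dom mon dow user  command
0 1 * * * root /usr/sbin/qm start 101
0 2 * * * root /usr/sbin/qm shutdown 101 && /usr/sbin/qm start 102
0 3 * * * root /usr/sbin/qm shutdown 102 && /usr/sbin/qm start 103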
 
For reference on the mentioned dynamic memory management and zram swap options:
I concur with UdoB and leesteken that your best bet would be to just run a subset of the VMs and schedule their start and stop accordingly. This could be combined with editing the VM config to give it more RAM before the evening job and remove it afterwards.
If you can't have any downtime, I would go with swap inside the VMs, but this will obviously hurt performance.
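As for the resize before/after the evening job, a rough sketch (VM ID and sizes are placeholders; depending on the guest and whether memory hotplug is configured, the change may only take effect after a reboot of the VM):

Bash:
# before the evening job: raise the limit to 16 GiB
qm set 101 --memory 16384
# ... calculation runs ...
# afterwards: shrink back to the 4 GiB baseline
qm set 101 --memory 4096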
 
  • Like
Reactions: IcePlanet and UdoB
Follow-up question:
where 8 VMs...

Does your setup really need eight machines? Or do you have eight users with one VM each? If the latter is true: can't they use the same VM? In that case you could give it 32 GiB to let two users run that 16 GB job at the same time. (Concurrent users need a Terminal Server, a "normal" Windows does not allow this.)

Just thinking...
 
  • Like
Reactions: Johannes S
Thank you for all the information and sources to learn something new. I did not manage to try it today, too much work at work...

@UdoB & @leesteken Based on your responses and what I have read, it seems ballooning is not a good idea, so I'm crossing it out.

KSM I will definitely try out, it sounds good, thanks @Johannes S & @leesteken, because all machines run the same OS and the same software. Unfortunately it is not possible to hibernate or suspend any machines; the SW is quite old and the processing starts at random times. Currently I do not know if it is possible to decrease the number of PCs/VMs, I must look into this deeper.

@Johannes S The performance impact of using swap would only materialize if more than 2 machines start the calculation in parallel. This is the trade-off for having only one physical HW instead of 8. If/how it will work I must test first.

Most of the time there are NO users, the machines only collect data... When I want to check something or solve a problem I log in via RDP, but this is at most once per month.
 
  • Like
Reactions: Johannes S
Suppose KSM does wonders and can share 2GB of each VM. Then you need 2+8*2=18GB to run all 8 VMs. Each VM that does a processing run takes another 12GB (assuming that the data is unique). That would only allow 3 VMs to process: 36+18=54GB, which leaves Proxmox enough memory for other stuff. I'm not sure if a 4th processing VM would use swap or if one VM would randomly get killed by OOM. If KSM shares only 1GB, then 3 processing VMs use 61GB, which might just work as it leaves Proxmox the minimal 2GB plus 1GB for file cache. Either way, provide some (or a lot of) fast swap. Best to look into a way to prevent too many VMs from processing at the same time, or just test it (how hard is the +12GB and how random is the processing?).
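Reading the shared portion as counted once on the host, the budget works out roughly like this (all numbers are the assumptions from above):

Bash:
# 2 GiB shared (one copy) + 8 VMs x 2 GiB unique each
echo $(( 2 + 8*2 ))          # 18 GiB for 8 idle VMs
# plus 3 processing runs at +12 GiB unique data each
echo $(( 2 + 8*2 + 3*12 ))   # 54 GiB, leaving ~10 GiB of the 64 GiB for the host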

EDIT: Alternatively, can you reduce the memory of the VMs and let the processing run with swap/pagefile inside the VM?
 
Last edited:
  • Like
Reactions: Johannes S
The good news is that on the test installation everything now runs on the 1st attempt, without any complicated settings.

On the bad side, the Windows installation (moved from a physical machine) is extremely slow and has very slow response (lagging). I cannot tell whether it is a problem of moving Windows to a VM or whether some setting in Proxmox is incorrect. I will try a fresh Windows VM install, but that must wait for tomorrow.

I would like to thank you for the hints and links, they made my beginnings very smooth. Thanks!
 
  • Like
Reactions: Johannes S and UdoB
I would like to thank you for the hints and links, they made my beginnings very smooth. Thanks!

You might want to read the best-practice pages for Windows guests; it's important to have the correct drivers and machine settings for sufficient performance:

Please note that there is a recent problem with the CPU type host and enabled CPU security bug mitigations; see https://forum.proxmox.com/threads/cpu-type-host-is-significantly-slower-than-x86-64-v2-aes.159107/ and https://forum.proxmox.com/threads/t...-of-windows-when-the-cpu-type-is-host.163114/ for more information.
 
Last edited:
With a fresh Win 10 install, the speed is perfect.
But now I made a small test of how memory assignment works. Physical memory is 16GB; I made 2 VMs, each assigned 10GB of RAM, NO ballooning. VirtIO drivers and guest tools are installed on both.

When VM1 is running, the complete 10GB is shown as used in the Proxmox node summary, although Task Manager inside VM1 shows 1.9 GB used. When VM2 is started (VM2 Task Manager showing 3.4GB used), it shows the complete 16GB used + swap used, but KSM sharing is 0.

I turned off both VMs, activated ballooning on both and set the minimum RAM to 2048. Started VM1, then started VM2, and the result is the same as without ballooning.

My expectation was that the memory assignment would depend on usage. E.g. when VM1 is starting, maybe during startup it goes up to 3-4 GB, then drops to 1.9GB as reported by the VM1 Task Manager (maybe a little more due to VM overhead).

I followed the Windows install guide posted above by @Johannes S. My current suspects are the VirtIO drivers, as Proxmox is obviously not getting correct data from the VM.
 
  • Like
Reactions: bonnma
When VM1 is running, the complete 10GB is shown as used in the Proxmox node summary, although Task Manager inside VM1 shows 1.9 GB used.


This is working as expected: https://forum.proxmox.com/threads/f...ram-usage-of-the-vms-in-the-dashboard.165943/

When VM2 is started (VM2 Task Manager showing 3.4GB used), it shows the complete 16GB used + swap used, but KSM sharing is 0.

I have no idea how KSM actually works, but this seems odd. Maybe somebody else has some insights on how to get more out of KSM?
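For what it's worth, whether KSM is doing anything at all can be checked via the kernel counters on the host (a hedged sketch; as far as I know ksmtuned only activates KSM once host memory use passes its threshold, and the scan takes a while, so freshly started VMs may legitimately show 0 for some time):

Bash:
cat /sys/kernel/mm/ksm/run            # 1 means KSM is currently active
cat /sys/kernel/mm/ksm/pages_sharing  # deduplicated pages (multiply by the 4 KiB page size)
cat /sys/kernel/mm/ksm/pages_shared   # unique pages that are being shared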
 
If I may ask a few more questions:
1. How can I check whether VirtIO is correctly installed and doing what it should?
2. What CPU type should I set? Currently using x86-64-v2-AES (x86-64-v3 and x86-64-v4 do not boot).
3. Enable NUMA or not?
4. Is machine type q35 OK?
5. What should be set for vIOMMU? I have it at Default (None); again, changing it to anything else results in a non-booting VM.

The error is always the same (when a non-booting VM is present): TASK ERROR: start failed: QEMU exited with code 1
 

But this results in a situation where 2 VMs with 10GB each (in reality using 2-3GB per VM) cannot be run, because they are very laggy, each menu unfolds over 1-2 seconds... because of running from swap.

I was watching a video :-D : https://www.youtube.com/watch?v=X-_jY-D7RiI and it gave me hope that ESXi-like behaviour is achievable (allocating memory not based on the configuration of a given VM, but on real usage).

Edit: Note on KSM - VM2 was created as a clone of VM1, so there should be a good chance that KSM finds something similar.

Edit2: Running from swap: As written above, each machine needs 16GB once per day for a short time (anything from 5 sec to 20 min). The chance of a conflict is small, and in the rare situation of calculations in a conflicting time window it can use swap (this is the price for less physical RAM). But the 'working as expected' configuration means that the machines run from swap 24/7, and this complicates everything - basically making it impossible.
 
Last edited:
Please share qm config VMIDHERE --current from the node and Get-Service -Name "BalloonService" from inside the VM via PowerShell.
Some more information and pictures showing the issue might also be helpful.
 
Last edited:
  • Like
Reactions: Johannes S
Thanks for the hints:

This is the result of qm config VMIDHERE --current:
Bash:
root@proxmox:~# qm config 100 --current
agent: 1
balloon: 0
bios: ovmf
boot: order=ide2
cores: 4
cpu: x86-64-v2-AES
efidisk0: local-lvm:vm-100-disk-0,efitype=4m,pre-enrolled-keys=1,size=4M
ide2: none,media=cdrom
kvm: 1
machine: pc-q35-9.2+pve1,viommu=virtio
memory: 10240
meta: creation-qemu=9.2.0,ctime=1752074116
name: VM1
numa: 1
ostype: win10
scsi1: local-lvm:vm-100-disk-3,cache=writethrough,iothread=1,size=70G
scsihw: virtio-scsi-single
smbios1: uuid=4708b385-4b4d-4602-8d7b-95333e257318
sockets: 1
tpmstate0: local-lvm:vm-100-disk-2,size=4M,version=v2.0
unused0: local-lvm:vm-100-disk-1
unused1: local-lvm:vm-100-disk-4
vmgenid: 176f5b44-8151-4a2d-821c-15c62e65dd37

Result of Get-Service -Name "BalloonService": Status is Running.

Attached are screenshots from my test install; this is very different compared to the video listed above (I also made a screenshot of the video). Please excuse the strange language in the memory summary, but I believe everyone knows what the numbers mean: it shows 1.8 GB (or 1.9 GB, but definitely less than 10GB) used, yet the hypervisor insists on 10 GB for this machine regardless of usage (as configured in the Hardware section).

Edit: Clarification: the total shows 20GB used RAM+swap because there are two 10GB VMs.

Two 10GB VMs running.png
 

Attachments

  • VM1 memory usage.png
    VM1 memory usage.png
    42.1 KB · Views: 11
  • VIDEO YOUTUBE.png
    VIDEO YOUTUBE.png
    551.3 KB · Views: 7
  • Balloon service status.png
    Balloon service status.png
    6.8 KB · Views: 11
Last edited:
And here is the version with ballooning enabled. IO delays are on average about 10 percentage points higher, but nothing else changes.
 

Attachments

  • Ballooning enabled 4 to 10 GB.png
    Ballooning enabled 4 to 10 GB.png
    33.8 KB · Views: 7
  • Ballooning config.png
    Ballooning config.png
    11.7 KB · Views: 8
This last picture only shows the node's Summary, not the VM's, so I can't compare. Also note that ballooning only starts to kick in when the node is over 80% memory usage (by default). You can set min and max to the same value if you want.
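For reference, pinning min and max to the same value (or disabling the balloon entirely) can also be done from the CLI; a sketch using the VM ID and size from the config posted earlier:

Bash:
# make the balloon minimum equal to the full allocation (min = max)
qm set 100 --memory 10240 --balloon 10240
# or disable ballooning for this VM altogether
qm set 100 --balloon 0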
 
Last edited:
  • Like
Reactions: Johannes S
Do you happen to have a monitoring tool like PRTG, Prometheus, Zabbix or Icinga installed on the VMs? If not, it might be worth checking how much RAM each of the VMs actually uses over a period of time. If it turns out that most of the time a VM doesn't actually use that much RAM, it might (just an idea, I haven't tested it) be worth setting the amount to a small default value and raising and lowering it around the daily task (given you can narrow down its timeframe). As far as I know, raising and lowering RAM should be possible without powering down the VMs. I might be wrong on that though.
 
Last edited:
Maybe to formulate my question more precisely: Is Proxmox able to allocate RAM based on real usage?

Looking at my pictures, the VM memory consumption (IF I understand the Windows reporting correctly) shows 1.8 GB used for applications and 2.3 GB used as cache, which totals 4.1 GB used. I did not make a screenshot of VM2, but as it is a clone of VM1 we can safely assume its memory usage is similar (4.1 GB); this results in 8.2 GB used (even if we assume the worst case and VM2 is evil and uses all 10GB available, we have a total usage of 14.1 GB).

In summary, the 2 VMs need about 8.2 GB (worst case 14.1 GB). The Proxmox overhead seems to be 1-2GB? This totals 10.2GB (worst case 16.1 GB).

As you can see in the screenshot above, the Proxmox memory usage shows 21.7GB (more than double the expected 10.2 GB). Again assuming a Proxmox overhead of 1-2 GB, this leads me to the assumption that Proxmox ignores real usage and allocates memory based only on the VM hardware configuration (10GB VM1 + 10GB VM2 + 1-2 GB overhead = 21.7GB).

Can someone experienced with Proxmox please let me know whether allocating memory based on configuration (NOT on real usage) is the intended behavior of Proxmox, or whether I have wrongly configured my installation?
 
Last edited: