Ballooning issues with Windows guests

Azunai

Member
Sep 28, 2019
Hello everyone.
I have PVE 6.3-2 with two Windows VMs running Server 2019.
The server has 64 GB of RAM installed.
I set each Windows VM to 8 GB minimum and 48 GB maximum.
I installed the QEMU guest agent and all the missing drivers from virtio 1.185.
I also installed the balloon service on each VM, and it is shown as running.
In the Proxmox GUI, each VM correctly shows how much RAM is currently being used.
In each VM's config, the min and max RAM are set and "ballooning device" is checked.

But here is the issue: the PVE node always allocates the max setting for each VM. So even though both VMs only use 3 GB of RAM, the host maxes out at 64/64 GB plus swap and becomes extremely slow and laggy.

From my understanding of ballooning, you should be able to run both VMs and have the host allocate only the min setting plus whatever each VM individually needs, right?
Or will it always allocate the max setting to start a VM and then free the memory after a successful boot?

Did I miss anything in setting this up, or is there a known issue?
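
For anyone wanting to check the numbers in a setup like this, here is a minimal sketch in Python. It assumes the VM config text uses the `memory` key for the max and the `balloon` key for the min (both in MiB), which is how I understand the files under /etc/pve/qemu-server/ to work. The sample config text and the function names are made up for illustration.

```python
def parse_vm_memory(conf_text):
    """Return (min_mib, max_mib) from the text of a VM config file.

    Assumption: 'memory' holds the max and 'balloon' holds the min, both in MiB.
    A missing 'balloon' key or 'balloon: 0' is treated as min == max.
    """
    values = {}
    for line in conf_text.splitlines():
        if line.startswith("["):
            break  # stop at the first snapshot section
        key, sep, value = line.partition(":")
        if sep:
            values[key.strip()] = value.strip()
    max_mib = int(values["memory"])
    balloon = int(values.get("balloon", max_mib))
    min_mib = balloon if balloon > 0 else max_mib
    return min_mib, max_mib


def commitment(host_mib, vms):
    """Sum the min and max settings of all VMs and compare them to host RAM."""
    total_min = sum(lo for lo, _ in vms)
    total_max = sum(hi for _, hi in vms)
    return {
        "min_total": total_min,
        "max_total": total_max,
        "min_fits": total_min <= host_mib,
        "max_fits": total_max <= host_mib,
    }


# Hypothetical config text matching the setup described above (8 GB min, 48 GB max).
SAMPLE_CONF = "cores: 4\nmemory: 49152\nballoon: 8192\nostype: win10\n"

vm = parse_vm_memory(SAMPLE_CONF)
report = commitment(64 * 1024, [vm, vm])
print(report)
```

With two such VMs the minimums add up to 16 GB, which fits in 64 GB, but the maximums add up to 96 GB, which does not. So if the host really does hand out the max to both guests, running into swap is what you would expect.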
 
It hasn't been acknowledged by the Proxmox team yet, but there is a memory leak.

It seems to affect only Windows and might have to do with the new virtio drivers.

Maybe you could try an older virtio version and report back. See https://fedorapeople.org/groups/virt/virtio-win/direct-downloads/archive-virtio/

The only workaround for now is disabling ballooning.


I was the first to report it. Sadly there has been no response, but many more people have the problem now.

https://forum.proxmox.com/threads/windows-vm-using-more-memory-then-allocated.79368/

https://forum.proxmox.com/threads/proxmox-6-x-consumes-more-memory-than-assigned.79520/

https://forum.proxmox.com/threads/b...s-server-2019-not-working-as-exptected.79583/

https://forum.proxmox.com/threads/ballooning-with-windows-guest-issues.79920/
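
As a rough sketch of the workaround mentioned above (disabling ballooning), the following Python builds the `qm set` command line that sets `balloon` to 0 for a VM. It only prints the commands rather than running them, the VM IDs are hypothetical, and the function name is my own. I believe the VM needs a full stop and start before the change takes effect, but treat that as an assumption.

```python
import shlex


def disable_balloon_cmd(vmid):
    """Build the argv for turning the balloon device off on one VM."""
    if int(vmid) <= 0:
        raise ValueError("vmid must be a positive integer")
    return ["qm", "set", str(int(vmid)), "--balloon", "0"]


# Hypothetical VM IDs, for illustration only.
for vmid in (101, 102):
    print(shlex.join(disable_balloon_cmd(vmid)))
```

Printing first makes it easy to review the commands before pasting them into a shell on the node.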
 
Hi, I read through all of those topics, and the last one sounded familiar.
I have not yet seen steadily rising memory allocation, which would be typical memory-leak behaviour. But one of the posts you linked describes exactly what I experience: it feels like the host never frees the memory the guests don't use.
It should release the extra memory it allocated once the guests have started successfully, but it does not.
I tested it on a Debian VM and it works as expected, so this looks like a Windows-specific problem.
I can't recall having this issue with PVE 6.0, virtio 1.171 and the same guest OS.
I'll try the older virtio drivers tomorrow.
 
So now I have tried virtio 1.190 (testing) and there is no change: same behavior as the current stable version.
However, I wasn't able to downgrade to 1.171. For some reason the ballooning driver reappears instantly after I uninstall it, or maybe the uninstall just fails and Windows does not recognize it.
 
Same here.

With a Windows guest, the balloon does not seem to work.

On the VM page the memory is reported correctly, but the total memory used on the host increases by the full amount of memory assigned to the VM.

VM on:
1611340975787.png
1611341000801.png


VM off:
1611340906407.png
 

I am having the memory leak issue with Server 2012 R2 and every virtio balloon driver I try on Proxmox 6.3.3.

If I install the balloon driver (v 1.9.0, 1.85 or, as a test, 1.71), this particular VM develops a memory leak, uses all of its memory within one hour and crashes.

The VM is set up with ballooning from 4 GB to a max of 16 GB.

When the ballooning device driver is removed, Windows memory usage immediately falls, in this case from 99% to 23%.

Does anyone have a solution to this, or is this something that needs to be fixed in Proxmox 6.3.3?

I can confirm that Server 2016 has the same problem, as I just checked. Fortunately, there it eats all of the RAM but doesn't crash the VM within one hour.

This is something that desperately needs to be resolved.
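
To put a number on how fast a guest is filling up, here is a small Python sketch that fits a straight line through a few (minutes, percent used) readings and estimates how long until the guest hits 100%. The readings below are made up for illustration and are not measurements from my VMs. A real leak may not grow linearly, so this is only a rough estimate.

```python
def minutes_until_full(samples, limit=100.0):
    """Estimate minutes from the last sample until usage reaches `limit`.

    samples: list of (minutes, percent_used) pairs.
    Uses an ordinary least-squares slope; returns None if usage is not growing.
    """
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_t = sum(t for t, _ in samples) / n
    mean_u = sum(u for _, u in samples) / n
    var_t = sum((t - mean_t) ** 2 for t, _ in samples)
    if var_t == 0:
        raise ValueError("samples must span more than one point in time")
    slope = sum((t - mean_t) * (u - mean_u) for t, u in samples) / var_t
    if slope <= 0:
        return None
    _, last_u = samples[-1]
    return max(0.0, (limit - last_u) / slope)


# Hypothetical readings taken every 10 minutes after starting the balloon service.
readings = [(0, 23.0), (10, 35.0), (20, 47.0), (30, 59.0)]
eta = minutes_until_full(readings)
print(f"about {eta:.0f} minutes until the guest is full")
```

A steady guest returns None, which is a quick way to tell a leak from normal usage once the balloon driver is installed.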
 
Just to summarize, as I think the earlier posts are confusing.

1. This issue affects both the Windows VM and the Proxmox host. However, the trouble is more noticeable in the Windows VM.

Once the balloon driver is installed in the Windows VM and the balloon service is started, a memory leak starts immediately.

Memory usage builds until it approaches or exceeds 100%. This will crash some Windows VMs with a memory-error blue screen.

The Proxmox host is not crippled, but it does suffer from the VM using all of its RAM because of the leak.

2. If the balloon driver is uninstalled, Windows memory usage immediately drops back to what it should be. Proxmox then automatically assumes the VM is using most of its RAM, which is expected.

This is why ballooning can be so useful when it works correctly. These virtual machines average only around 25% RAM usage, but from the Proxmox host's perspective they have to use all of it, since ballooning causes the leak and the driver has to be uninstalled in the Windows VM to stop it.

3. As stated by others, the problem does not seem to occur with Linux guests and ballooning (I'm running many types).

@ Proxmox team
Can you please let us know your thoughts on and knowledge of this problem, and whether there is a path to a resolution?
 
@ Proxmox Team

Can you please respond, at least to say that you have noticed this current bug with ballooning on Windows guests?

Even if there is no current path to a solution, it is better to know where we stand.

For now, making sure the Windows guests do not have the balloon driver installed is the only solution to the memory leak I can find.
 
I am also having this. Could anyone from Proxmox please respond?
 
I'll also add that ballooning works correctly with FreeBSD-based pfSense right out of the box. The guests use only as much memory as they need, and it is reported correctly to the Proxmox host.

Does anyone know what is going on with QEMU that causes this with the Windows virtio guests? This used to work correctly in earlier versions of Proxmox.
 
So, I switched one server to the pvetest repository in order to see if this is being worked on.

At first there was no change, but after the latest round of updates I reinstalled the balloon driver in a Windows Server 2012 R2 VM and a Windows Server 2016 one, and one hour on there are no memory leaks. It would appear this is being worked on, and we may be close to the fix reaching the regular pve-no-subscription and enterprise repositories.
 
This has to be an issue in Proxmox. I ran into it with Talos and their ISO. It happens during the kernel boot phase, or something along those lines; I wrote a lot about it on their GitHub, and you can read all about it in the issue linked at the end of this post. The VM leaks memory onto the main node and releases it after some hours, but if you reboot the VM it fills the memory up again. The problem is that if you have three over-provisioned machines and they all boot at the same time, they will kill the server completely. Not just the VM, the whole server, because the RAM fills up and is then not released properly.

You can check this yourself: look at the main node with htop or atop. Give a Windows or Talos VM a high RAM value and notice that the RAM is not released. (I think Talos might have modified their package for this bug.) It isn't just misreporting in Proxmox; the memory really fills up on the host and then cannot be released.

I wonder if this is a problem with QEMU inside the platform. I thought I read somewhere that QEMU was being used for Windows, and I think QEMU is also being used by Talos. They are doing something with it inside the VM.

https://github.com/talos-systems/talos/issues/3054. You can see the issue there as well.
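
If you would rather get a number than eyeball htop, here is a minimal Python sketch that reads the resident memory (VmRSS) of a VM's QEMU process on the node. The parsing follows the normal /proc/<pid>/status format on Linux. The pid file location /var/run/qemu-server/<vmid>.pid is an assumption on my part, and the function names and sample text are made up for illustration.

```python
import re


def vm_rss_kib(status_text):
    """Extract VmRSS in KiB from the text of /proc/<pid>/status."""
    match = re.search(r"^VmRSS:\s+(\d+)\s+kB$", status_text, re.MULTILINE)
    if match is None:
        raise ValueError("no VmRSS line found")
    return int(match.group(1))


def read_vm_rss_kib(vmid):
    """Read the RSS of the QEMU process for one VM (run this on the node)."""
    # Assumption: the node keeps one pid file per VM at this path.
    with open(f"/var/run/qemu-server/{vmid}.pid") as fh:
        pid = int(fh.read().strip())
    with open(f"/proc/{pid}/status") as fh:
        return vm_rss_kib(fh.read())


# Hypothetical status text for a VM holding its full 48 GiB on the host.
SAMPLE_STATUS = "Name:\tkvm\nVmPeak:\t 51000000 kB\nVmRSS:\t 50331648 kB\nThreads:\t8\n"
print(vm_rss_kib(SAMPLE_STATUS) / (1024 * 1024), "GiB resident")
```

Comparing this figure with what the guest itself reports as used shows whether the host is really holding memory the guest is not using.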
 
@bbarclay The server that I am running on the pvetest repository is still working with several Windows VMs, and the balloon driver is working correctly. You could try running Talos on a machine updated to the test repository and see whether the memory leak is still present. I'm sure others would find it interesting if you could report the results.

Also, yes, the leak eats up memory on the Proxmox host because it is eating the RAM within the Windows guest. However, once the ballooning driver is removed in Windows, the leak disappears. Proxmox will then treat the machine not as a ballooned one but as one that is assigned fixed RAM, and it will set aside the max RAM for the machine regardless of how much Windows is actually using.

If switching a server to the pvetest repository fixes the leak, then it's a safe bet that the Proxmox team is close to moving this fix into the other repositories, which will correct it for any OS.

@proxmox - Any interest in confirming this for us? Or is that kind of response reserved for paying customers?
 
FYI - the balloon driver works in the current non-subscription kernel 5.4.103-1-pve#1

There still is an occasional error that breaks this and requires a host reboot to correct it, but it is relatively stable now.
 
Thanks - do you know whether it's fully fixed in the latest (5.11.22-7)?
 
It is improved, but there can still be leaks. I have found memory leaks are more common with Windows VMs on AMD CPUs than on my Xeon servers.
 
Interesting and way beyond my pay grade :)

Thanks, I'll keep an eye on my test server. For now, we have ballooning disabled on production.