Windows 10 VM Ballooning -> almost 100% RAM usage

rumble06

New Member
Jul 12, 2019
After enabling ballooning and following the wiki (I installed the drivers and got the balloon service running on Windows), the RAM usage in the Proxmox panel did not go down; instead, RAM usage went up inside the VM. Minimum RAM: 2048MB, total RAM allocated to the VM: 14336MB. Usage now sits at 90%, not only in the Proxmox panel but also in the VM's Task Manager.

Edit: Uninstalling or stopping the balloon service didn't help; RAM usage stayed above 90% the whole time (even right after boot), which is more than 12GB of the 14GB allocated to the VM.
But after I clicked "Uninstall device" on the VirtIO Balloon device in Device Manager, the problem was fixed: current RAM usage is only 2.5GB inside the VM (however, Proxmox no longer reports that figure, which is exactly why I want to use the ballooning service).

Edit 2: There seems to be a memory leak in the Windows 10 balloon driver. After reinstalling the driver I watched Task Manager: RAM usage climbs by roughly 100MB every 3-4 seconds and doesn't stop.
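
For cross-checking, the balloon size that the driver reports back to the host can be read from the QEMU monitor; it should normally track what the guest shows. A sketch (VM ID 100 is a placeholder for the affected VM; the command prints a line like "balloon: actual=<MiB>"):

Bash:
root@pve:~# qm monitor 100
qm> info balloon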
 

Attachments

  • Windows10Ballooning.png (35.9 KB)
  • Windows10Ballooning2.png (38.2 KB)
Which version of the VirtIO drivers are you using?
 
I have the same problem on Windows 10 1909, freshly installed.

I tried the latest version (1.173) as well as the stable version (1.171), and I made sure that the ballooning service was running and the drivers were installed. Still, RAM usage keeps increasing while the service is running.

Another problem I ran into with two Windows 10 machines and ballooning enabled: Proxmox froze completely and reset with the following settings when both VMs were started:

VM1: 2GB min RAM, 8GB max RAM
VM2: 4GB min RAM, 16GB max RAM

I have 16GB of system memory and no other VMs running, so I expected this to work. Ballooning was enabled for both, and the ballooning service was running (drivers installed as well). When run individually, RAM usage was reported correctly for both machines in the Proxmox web interface.
 
What's in your syslog? I would guess that your host ran out of memory and that's why it crashed.
If you have only 16GB of memory in total and one of your VMs is allowed to take all of it, it isn't surprising that your host struggles.
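
If the host really did hit the OOM killer before going down, traces of it usually survive in the kernel log and the rotated syslogs. A sketch (adjust the paths if your logs rotate elsewhere):

Bash:
# search the current and rotated syslogs for out-of-memory events
root@pve:~# zgrep -iE 'out of memory|oom-killer' /var/log/syslog*
# kernel ring buffer of the current boot, with readable timestamps
root@pve:~# dmesg -T | grep -iE 'oom|killed process'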
 
The machines only take up around 3GB of RAM when idle, so from the guest side, RAM usage was not excessive. Isn't ballooning supposed to manage RAM so that I can overprovision it?

With ballooning enabled in the Proxmox web GUI, there seems to be a memory leak: every 3-4 seconds, RAM usage inside Windows increases by 100 MB. I found that this behaviour persists even after disabling the ballooning service, so I assume the balloon driver is the cause of the leak. Disabling ballooning in the GUI also stops it.
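
For what it's worth, the same toggle is available per VM from the host CLI, which makes it quick to test with and without the balloon device. A sketch (VM 101 is used as the example; the change applies after the VM is restarted):

Bash:
# disable the balloon device entirely for VM 101
root@pc:~# qm set 101 --balloon 0
# re-enable dynamic memory with a 1 GiB minimum
root@pc:~# qm set 101 --balloon 1024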

I will later check the syslog.
 
The machines only take up around 3GB of RAM when idle, so from the guest side, RAM usage was not excessive. Isn't ballooning supposed to manage RAM so that I can overprovision it?
It is, but you told us that ballooning isn't working inside your VM for whatever reason, so it is probably not behaving as expected.
Without logs and stats from that time it's hard to say anything about it.
 
OK, I will soon return with more information. Could you point me to relevant log files apart from the syslog?
 
The testing configuration was as follows:

VM1 and VM2 each have min. 2GB and max. 8GB of RAM set, with ballooning enabled. Reporting of RAM usage to the Proxmox user interface works correctly, meaning the Windows and Proxmox RAM usage displays are consistent. RAM usage slowly drops from 90% over time until it reaches a normal level.
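
For reference, that setup corresponds to the following CLI call, where "memory" is the maximum and "balloon" the minimum in MiB (a sketch, using VM 101 as the example):

Bash:
# max. 8 GiB, min. 2 GiB with ballooning enabled
root@pc:~# qm set 101 --memory 8192 --balloon 2048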

However, when one machine is running (idling at around 3 GB of used RAM), starting the second one still crashed Proxmox completely. The syslog does not show any additional output when the crash happens, whether watched with "tail -f /var/log/syslog" over SSH or from the shell in the web interface. With and without ballooning, the syslog contents on launch of a VM are the same. Unfortunately, I failed to capture it and will do so later.
 

Attachments

  • ram.PNG (39.8 KB)
  • stats.PNG (251.1 KB)
When both machines are running with more conservative memory settings (VM1: 1-8 GB RAM, VM2: 1-6 GB RAM, so no overprovisioning), this is what I get from "free -h" after about 5 minutes of idling, with ballooning enabled:

Bash:
root@pc:~# free -h
              total        used        free      shared  buff/cache   available
Mem:           15Gi        14Gi       208Mi        17Mi       461Mi       316Mi
Swap:         8.0Gi       545Mi       7.5Gi

It does not look like there is a lot of memory available to the host.

I also do not understand why the host crashes in case of overprovisioning, even if no RAM is available: there is still enough swap space for it to use.
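
One way to see whether the host actually starts swapping before it goes down is to watch memory and swap activity from a second SSH session while the second VM boots (a sketch):

Bash:
# one line per second; the si/so columns show swap-in/swap-out in KiB/s
root@pc:~# vmstat 1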

Find the stats of the two VMs (vm1.png/vm2.png) attached. Both are running with ballooning, and RAM reporting is accurate. "BLNSRV.exe" shows up as a process in Task Manager on both, and the balloon driver is also visible in Device Manager and running fine.
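
The same balloon figures can also be pulled per VM on the host, which makes them easy to compare with the screenshots (a sketch; the exact field names may vary between versions):

Bash:
# host-side view of VM 101's memory and balloon status
root@pc:~# qm status 101 --verbose | grep -iE 'balloon|mem'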

The syslog when running these two VMs is attached.

I noticed that the Windows VMs become slow to respond when the RAM is as full as it is with these settings (see ram.png).

For completeness, find the configuration settings here:

VM1:
Bash:
root@pc:~# qm config 101
agent: 1
balloon: 1024
bios: ovmf
boot: cdn
bootdisk: scsi0
cores: 8
cpu: host
efidisk0: local-lvm:vm-101-disk-1,size=128K
hostpci0: 0a:00,x-vga=1,pcie=1,romfile=GTX1070.bin
hostpci1: 0b:00.3,pcie=1
machine: q35
memory: 8192
name: Windows10a
net0: virtio=F6:FB:DE:3A:E5:8B,bridge=vmbr0,firewall=1
numa: 1
onboot: 1
ostype: win10
scsi0: local-lvm:vm-101-disk-0,backup=0,cache=writeback,iothread=1,size=128G,ssd=1
scsi2: /dev/disk/by-id/ata-CT2000MX500SSD1_1825E144C0C8,size=1953514584K
scsihw: virtio-scsi-single
smbios1: uuid=272870d3-f14e-4ae2-b9b9-a47e1b6ecd4b
sockets: 1
vga: none
vmgenid: b5558763-7896-4b51-bff3-ad1cc7627879

VM2:
Bash:
root@pc:~# qm config 110
agent: 1
balloon: 1024
bios: ovmf
bootdisk: scsi0
cores: 8
cpu: host
efidisk0: local-lvm:vm-102-disk-1,size=128K
hostpci0: 01:00,pcie=1,x-vga=1
machine: q35
memory: 6144
name: Windows10b
net0: virtio=CE:21:FB:FA:E2:41,bridge=vmbr0,firewall=1
numa: 1
ostype: win10
scsi0: local-lvm:vm-102-disk-0,backup=0,cache=writeback,iothread=1,size=128G,ssd=1
scsihw: virtio-scsi-single
smbios1: uuid=b8d50aee-0211-4e45-ae22-b284360bbf0c
sockets: 1
usb0: host=0416:0123
usb1: host=feed:2260,usb3=1
usb2: host=1532:0016
vga: none
vmgenid: 3b7dd095-86d4-4712-9ea6-4f18ee215381

Do you have any ideas where to look next?
 

Attachments

  • vm2.PNG (20.9 KB)
  • vm1.PNG (20.8 KB)
  • syslog.txt (13.9 KB)
  • ram.PNG (39.9 KB)
That's not the syslog from the crash, that's just an excerpt from when you started both VMs. What's using the memory in your VM?
 
Yes, it is not from the crash; I will record another one when it crashes in the evening.

In the list of tasks, there is no process that uses much memory at all. I cannot see the source of the memory consumption. Blnsrv.exe also uses very little, on the order of 1 MB. I will also post a screenshot of the process list when I have the chance.

Are there hidden processes that Task Manager does not show by default?

Edit: I will have a look at Resource Monitor.
 
I actually don't know if this is the case for Windows, but I guess it could be that some processes are hidden or do not report properly to Task Manager.

Aren't the syslog files from the last crash still available?
 
I actually don't know if this is the case for Windows, but I guess it could be that some processes are hidden or do not report properly to Task Manager.

Yes, I will try and check Resource Monitor, which should also include system processes that I cannot see in Task Manager.

Aren't the syslog files from the last crash still available?

Maybe, but I don't know which one it is in the log rotation. I will just let it crash again.
 
Find a crash syslog attached. Here, one VM was rebooted with "unsafe" settings, meaning 1GB min. RAM and 12GB max. RAM, while the other machine had 2GB min. RAM and 6GB max. RAM. The reboot happens here:

Code:
Jan  9 18:35:52 pc qmeventd[874]: Restarting VM 101

Anyway, I cannot see anything useful in this log. After 18:36:00, the log only continues once the host has rebooted.

In resmon/poolmon, I cannot see any excessive RAM usage from processes or drivers at all.
 

Attachments

  • poolmon.PNG (51.5 KB)
  • resmon.PNG (71.9 KB)
  • syslog.txt (4.8 KB)
This is again just a small part of the syslog, without any information about a crash.
 
I understand your concern, but that is all the syslog I have. During the crash, nothing is added. I got this with "cat /var/log/syslog" after the reboot.
 
Just to make this clear: we are still talking about the Proxmox host crashing?

AFAICS this was "tail", not "cat", and I doubt that there is nothing more in your syslog than 55 lines starting at 6 p.m.
 
The system behaves like this: I reboot the offending VM with memory settings that lead to overprovisioning, everything freezes (VMs, web GUI and SSH all become unresponsive), and the Proxmox host reboots.

I then extracted the syslog after the forced reboot, which, of course, is very long. You are correct, I used tail, so this is not the complete syslog; sorry for the confusion. Still, I posted the part that includes everything between these two events:

Code:
Jan  9 18:35:46 pc pvedaemon[6712]: requesting reboot of VM 101: UPID:pc:00001A38:00021DA5:5E176472:qmreboot:101:root@pam:

This is where VM 101 was restarted with changed settings, so that RAM is overprovisioned. The logs continue until

Code:
Jan  9 18:36:00 pc systemd[1]: Starting Proxmox VE replication runner...

which is the last line the syslog contains before the system reboots due to the crash.

The next entry comes a few minutes later and contains Proxmox startup messages. I feel I should have included those lines as well to avoid the miscommunication, so please excuse this. Currently, I do not have access to the machine, so I cannot attach the rest.

As I mentioned, during the actual crash no lines are added to the syslog; the system just stops.
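
Since nothing reaches the disk during the crash, one option is to stream kernel messages to a second machine with netconsole so that the very last lines survive. A sketch (interface name, IP addresses and MAC are placeholders for your network):

Bash:
# on the Proxmox host: send kernel messages via UDP to 192.168.1.20:6666
root@pc:~# modprobe netconsole netconsole=6665@192.168.1.10/eno1,6666@192.168.1.20/aa:bb:cc:dd:ee:ff
# on the receiving machine (netcat syntax varies by variant)
$ nc -u -l 6666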
 
Hi there,
I think I'm having the same problem.
I see that, just like me, you are doing GPU passthrough on your VM while trying to use memory ballooning.

If you start only one VM with ballooning (with safe values) and compare the memory used inside the VM with the memory used according to Proxmox, does the memory ever decrease, or does it always stay at the ballooning maximum?
Also, could you redo the same test without GPU passthrough to see if it works?
 
The thing is, QEMU needs to allocate the whole memory up front if you use GPU passthrough, because of the memory mapping required for it.
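
That up-front allocation can be verified on the host: with passthrough, the resident set of the QEMU process stays close to the configured maximum even when the guest is idle. A sketch (VM ID 101 and the pidfile path follow the Proxmox defaults):

Bash:
# resident and peak memory of the QEMU process backing VM 101
root@pc:~# pid=$(cat /var/run/qemu-server/101.pid)
root@pc:~# grep -E 'VmRSS|VmHWM' /proc/$pid/status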
 
