Pagefile problem on Windows guests when memory hotplug is enabled

Hi,

We're having a problem on Windows guests up to Server 2016 when memory hotplug is enabled. As soon as we enable it, Windows cannot grow its pagefile beyond 4096 MB inside the guest OS. When the machine is busy, low-memory error messages appear and the guest sometimes becomes unresponsive. We do not see this behavior on 2019 guests.

It seems that when hotplug is enabled, Windows does not count the "physical RAM" the same way. The pagefile stays at 4096 MB and does not grow even when needed. Here are the steps we took to reproduce the problem:

On freshly installed 2012 R2, 2016, and 2019 VMs, we configured 6 GB of physical RAM for each machine and enabled memory hotplug. We ran a memory stress test tool (HeavyLoad from JAM Software) on each machine to check how it reacts and uses the pagefile once all the physical memory is consumed. The pagefile grew until it hit 4096 MB and stopped there, error messages about missing virtual memory started to show in the Event Viewer, and Windows started to choke. This problem was only visible on 2012 R2 and 2016. The 2019 server's pagefile continued to grow normally to 17 GB and we had no error messages.

We ran the same tests with hotplug disabled. The pagefile of all VMs grew normally to 20 GB+ and we had no error messages.

We then switched hotplug back on. The pagefile of the 2012 R2 and 2016 guests immediately dropped back to 4096 MB, even though the OS showed "18 GB recommended" in the pagefile settings. The error messages started again.


So it is certainly related to how Windows sees the hotpluggable memory. I was wondering, is there a known fix or workaround for this? Are there newer drivers I should install on my older 2012 R2/2016 guests?

Thank you very much
 
Hi,

Are the Windows guests fully updated, with all fixes installed?
 
Hi,

Thanks for your reply. Yes, they are. I even tried to manually update the pnpmem.sys driver for the hotplug memory devices. I do not know if that is what causes Windows not to adjust its pagefile, but I am testing everything I can right now.

It seems that Windows only adjusts its pagefile according to the QEMU -m parameter, which is the fixed memory. It does not take the hotpluggable memory devices into consideration.
 
Can you send the config of the non-working Windows VM?
 
Hi,

Here is the VM config.

Code:
agent: 1
balloon: 1024
bootdisk: scsi0
cores: 4
cpu: host
hotplug: disk,network,usb,memory,cpu
ide2: none,media=cdrom
ide3: none,media=cdrom
memory: 4096
name: TEST-2k16
net0: virtio=52:E7:12:E3:61:92,bridge=vmbr0,tag=1000
numa: 1
ostype: win10
scsi0: VM_DATASTORE:vm-167-disk-0,backup=0,discard=on,size=50G
scsihw: virtio-scsi-pci
smbios1: uuid=16455dd1-2e42-40a4-82c2-f6a4caa033ce
sockets: 1
vmgenid: 9b5a043b-17d9-4916-80c6-7c9d1073cb40

It is very easy to reproduce. Just enable memory hotplug and run HeavyLoad (by JAM Software) or any other tool that loads the RAM. Watch the "physical RAM" fill up and the pagefile start to grow. When it hits 4096 MB, the system will start to bog down with error messages and eventually crash.

Retry the same process with memory hotplug disabled. The pagefile will continue to grow past 4096 MB and the system will work fine until you run out of disk space for the pagefile, which is expected.

Note: again, this applies to Windows Server 2016 and earlier versions.
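If you want to watch the pagefile while the stress test runs, here is a hedged sketch of how to poll it from the Proxmox host through the guest agent. It assumes the qemu-guest-agent service is running inside the Windows guest and uses VM ID 167 from the config above; the `wmic pagefile` alias maps to the standard Win32_PageFileUsage class, but I have not verified this exact invocation end to end:

```shell
# Poll the guest's pagefile usage (values in MB) every 10 seconds while
# HeavyLoad runs. Requires agent: 1 in the VM config and the
# qemu-guest-agent service running inside the Windows guest.
while true; do
    qm guest exec 167 -- cmd /c "wmic pagefile get AllocatedBaseSize,CurrentUsage,PeakUsage"
    sleep 10
done
```

With hotplug enabled you should see AllocatedBaseSize stall at 4096 on the affected guests.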
 
From Microsoft:
3 × RAM or 4 GB, whichever is larger. This is then limited to the volume size ÷ 8. However, it can grow to within 1 GB of free space on the volume if required for crash dump settings.


I think Windows just does not count the hotpluggable DIMM devices as physical RAM, so it only calculates its pagefile based on the -m parameter, which is very low when hotplug is enabled. If the -m parameter is only 1 GB, the system pagefile will only grow to 4096 MB, which is the larger of 3 × RAM and 4 GB.
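To spell out that arithmetic, here is a small sketch of the sizing rule quoted above, max(3 × RAM, 4 GB), using 1024 MB as the RAM figure on the assumption that Windows only counts the fixed -m memory:

```shell
# Default automatic pagefile maximum: the larger of 3 x RAM and 4 GB.
# ram_mb=1024 is an assumption: the fixed "-m" memory Windows appears
# to count when hotplug is enabled.
ram_mb=1024
three_ram=$((3 * ram_mb))                              # 3 x RAM = 3072 MB
floor_mb=4096                                          # 4 GB floor
cap_mb=$((three_ram > floor_mb ? three_ram : floor_mb))
echo "pagefile cap: ${cap_mb} MB"                      # prints "pagefile cap: 4096 MB"
```

That matches the 4096 MB ceiling we observed exactly; without hotplug, Windows sees the full 6 GB and 3 × 6144 = 18432 MB, which also matches the "18 GB recommended" suggestion.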
 
The 2019 server page file continued to grow normally to 17GB and we've had no error messages.
As you said yourself, Windows Server 2019 does not exhibit this behavior. Microsoft seems to have addressed the issue in the current Windows Server version.
 
If the VMs run the qemu-guest-agent, you could push a command into the VM. This might make it possible to change the setting on the fly. ;)
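A hypothetical sketch of what that could look like, using VM ID 167 from the config above: disable automatic pagefile management and set an explicit maximum via the standard wmic pagefile classes. The sizes, drive letter, and quoting are assumptions for illustration, and I have not tested this through the agent:

```shell
# Turn off "Automatically manage paging file size for all drives".
qm guest exec 167 -- cmd /c "wmic computersystem where name='%computername%' set AutomaticManagedPagefile=False"

# Set an explicit 4 GB initial / 16 GB maximum pagefile on C:.
qm guest exec 167 -- cmd /c "wmic pagefileset where name='C:\\pagefile.sys' set InitialSize=4096,MaximumSize=16384"
```

A reboot of the guest is typically needed before the new maximum takes effect, so this is more of a persistent workaround than a true on-the-fly fix.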
 