Windows Server 2016 Memory Leak / Memory Ramping

Elfy

Well-Known Member
Dec 29, 2016
Hi all,

First-time poster and recovering ESXi user. Just wanted to say how impressed I am with the Proxmox virtualization platform. The PVE ecosystem is mind-blowing to me for an open source software package. So, to that I say a huge thank you! Keep up the great work!

I've searched for this issue extensively and have come up empty-handed, so I'm hoping someone may be able to point me in the right direction. I'm running a relatively clean installation of Windows Server 2016 as a VM. As you can see, I've set up 8GB of RAM to be automatically allocated using the Balloon Service, which has been installed and is running. I followed the Windows Guest Best Practices page as closely as I could. All VirtIO drivers are installed and appear to be working as intended. I also have the QEMU Guest Agent services running (not sure if relevant, just thought it may be worth mentioning).

What I'm seeing is that when the VM first starts, memory usage looks great: it's only consuming around 1.3GB. After a while (anywhere from 10 minutes to 36 hours), the memory ramps up to around 97% usage:
yKbRbyD.png


Task Manager on the server doesn't show any unusual memory-hogging applications, but it does show that 94% of the 8GB allocated is being used:
r3C7kpo.png
W2vdgnN.png


I did notice that sometimes (not every time) the memory ramping occurs when I remote desktop in, and I remember reading somewhere that someone had a graphics driver issue with a runaway memory process, but I'm not sure if that applies to my instance. I should also note that the memory never goes back down once it's ballooned, until I restart the VM. Any ideas?
 
Bumping this because it is still an ongoing issue for me. Anyone else seeing this?
 
Interesting. In my experience, Server 2016 seems to use a lot more RAM than 2012 R2 did on a clean system, but I haven't seen this yet. I just had a look: I've got a DC that has been up for 21 days, with AD Connect running on it as well, using 3.43GB of the 5GB I've allocated to it. That has all the VirtIO drivers, ballooning and the QEMU Agent as well, so it sounds very similar to your configuration.

How about building another server, putting little or nothing on it, and just seeing how it goes? Maybe it's something you've installed on the server rather than Proxmox / 2016 on its own?

Or, as you say, ballooning is kicking in, so do you have memory pressure on your host? ZFS is a common cause of that if you don't have a lot of RAM or have left it on the default configuration.
 
Thanks for the reply and advice, FastLaneJB!

Just a little background on this server: I've been running all my primary servers off my Windows Server 2012 machine, which is running on ESX and has been up for a few years. The Windows Server 2012 machine has been running great; however, I'm in the process of migrating everything to PVE, and moving whatever I can over to LXC containers (like email, web, etc.). I kind of like Windows Server for some of my game servers simply because I'm a GUI guy...

Anyway, as a setup measure I installed a fresh copy of Windows Server 2016 on PVE, installed IIS, ADDS Light, and my TeamSpeak server, and that's basically it. So it's a pretty clean server, so to speak, with very little load at the moment. I guess I could start by uninstalling the ADDS service, since it seems to be using the most memory, and see what happens. I'm hesitant to burn another Server 2016 license key just to test it, but that's certainly not out of the question. A good start may be to just "refresh" Windows and see if the memory ramps up after a while like I've been observing. Thoughts?

Also, I'm not sure how ZFS would cause memory pressure. Could you elaborate on that?
 
Well, if you were to try a new server, I wouldn't burn another license key as it's only for a test. The trial version runs for 180 days, which should be more than enough time ;)

Clearly, as you say, there's not a lot running on it, and I wouldn't really expect those services to swallow all that RAM. Looking at Task Manager, that doesn't seem to be the case; it seems more likely that the balloon service nabbed it. As it runs as a device driver, you don't see the RAM it has gobbled up in Task Manager.

How much RAM does your server have and what are the stats you see if you run "free" on it?
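If it helps to have the same numbers as "free" in a form that is easy to paste into a reply, here is a minimal Python sketch that reads /proc/meminfo on the host. The function names are made up for illustration, and it assumes a kernel that reports MemAvailable (it falls back to MemFree if not).

```python
from pathlib import Path


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer values (kiB for sizes)."""
    values = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep:
            continue
        fields = rest.split()
        if fields and fields[0].isdigit():
            values[key.strip()] = int(fields[0])
    return values


def summarize(info):
    """Return (total GiB, available GiB, percent used) from parsed meminfo values."""
    total = info["MemTotal"]
    # MemAvailable is the kernel's own estimate; fall back to MemFree if absent.
    available = info.get("MemAvailable", info.get("MemFree", 0))
    used_pct = 100.0 * (total - available) / total
    return total / 1024**2, available / 1024**2, used_pct


if __name__ == "__main__":
    path = Path("/proc/meminfo")
    if path.exists():
        total_gib, avail_gib, used_pct = summarize(parse_meminfo(path.read_text()))
        print(f"total {total_gib:.1f} GiB, available {avail_gib:.1f} GiB, used {used_pct:.0f}%")
```

Run it on the Proxmox host itself, not inside the guest. Keep in mind that on a ZFS host the "used" figure will include ZFS's cache.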

Because of the way it works, ZFS's cache (the ARC) isn't counted under buffers/cache on Linux; it shows up as used RAM instead. It also grabs a lot of RAM for disk caching by default, but you can cap that so it never takes too much, and I think most people here using ZFS do so.
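As a rough sketch of how to check and cap this: ZFS on Linux exposes its cache (ARC) counters in /proc/spl/kstat/zfs/arcstats, and the cap is normally set with the zfs_arc_max module option, usually in /etc/modprobe.d/zfs.conf. The Python below reads the current size and prints the option line for a given limit. The 4 GiB figure and the function names are only examples, not a recommendation for this particular host.

```python
from pathlib import Path

ARCSTATS = Path("/proc/spl/kstat/zfs/arcstats")


def parse_arcstats(text):
    """Parse arcstats 'name type data' rows into a dict of integers."""
    stats = {}
    for line in text.splitlines():
        fields = line.split()
        # Data rows have exactly three columns; header rows are skipped.
        if len(fields) == 3 and fields[2].isdigit():
            stats[fields[0]] = int(fields[2])
    return stats


def arc_max_option(limit_gib):
    """Build the modprobe option line for an ARC cap given in GiB."""
    return f"options zfs zfs_arc_max={int(limit_gib * 1024**3)}"


if __name__ == "__main__":
    if ARCSTATS.exists():
        stats = parse_arcstats(ARCSTATS.read_text())
        print(f"ARC size now: {stats.get('size', 0) / 1024**3:.2f} GiB")
        print(f"ARC max:      {stats.get('c_max', 0) / 1024**3:.2f} GiB")
    # Example cap of 4 GiB (an arbitrary value chosen for illustration).
    print(arc_max_option(4))
```

The printed line would go into /etc/modprobe.d/zfs.conf. On a host that boots from ZFS you would typically also need to refresh the initramfs and reboot before it takes effect.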

Also, if you're on the latest test kernel, you might be hitting an out-of-memory bug that appears to have been introduced in that kernel. There's another long thread on that, with a test kernel listed that you could try.
 
That's right, I forgot MS gives ridiculously long trial periods (no complaints here!). The server has 16GB of DDR4 RAM, and the VM is in fact installed on a ZFS partition (namely the local ZFS). I believe that when I set up the VM, I left the disk cache on the default option, which is no cache. I guess this would be an opportune time to add it in. Let me throw it 4GB of cache and see what happens. Which cache mode do you recommend (direct-sync, write-through, write-back, etc.)?

As of yesterday, I uninstalled all AD/DS services. I left IIS and my TeamSpeak servers. It has been up for 36 hours without any memory ballooning, so it could have been AD/DS sucking up the free RAM for some reason. I'm going to run it for a bit longer and see if anything changes. Here is the current memory usage, which looks great.
 
I figured out what was causing the high memory usage. It is actually completely unrelated to Windows or any application running on my Windows VM. A separate Linux CT is using a LOT of RAM and somehow "sucking" RAM away from the Windows VM. I don't quite understand it, and this may be a new issue entirely, but at least I can say it's not Windows' fault (for now ;)). I'll come back and edit this to post all of my memory allocations when I get a moment.
 
Hi There.
I came across your post today. I am having the exact same issue with my Server 2016 file server, built on VMware ESXi.
My file server has 24 GB of RAM, and memory usage steadily balloons up. I have even seen RAM usage spike after I RDP into my disconnected console. Memory never gets released, and once it fills up, the page file fills up and the server becomes inaccessible.
Can you please tell me what you did to troubleshoot and correct this issue?

Looking forward to your reply
Mike
 
Hi Mikenewmediary, thanks for reaching out.

Unfortunately I posted this over 3 years ago and my memory is rather foggy on how exactly I ended up working this specific issue out.

In general, I would assume guest memory issues are NOT Proxmox but rather the guest OS. Proxmox is really good at managing host resources, but the guest OS or services running within the guest OS might not be so good. For example, I occasionally see memory ballooning issues with my Debian 9 VM due to runaway processes. To fix the ballooning issues on my Deb 9 VM, I simply reboot periodically, and that takes care of it for a while. The Deb 9 VM is running Java services that are poorly optimized and have little to no garbage collection.

I have found that sometimes a poor implementation of a Windows service (e.g. IIS or Exchange Server) can cause memory ballooning. Using Windows tools to track down which process is hogging memory might be your best bet for debugging memory issues. I would also check and verify the following (Windows on Proxmox only, although VMware has its own memory management utilities):
  • Your VM is running the latest VirtIO balloon and qemu drivers.
  • The VirtIO balloon service is up and running on the guest OS.
    • Check services and see if "Balloon Service" shows in the list and is running.
  • The QEMU Guest Agent service is up and running on the guest OS.
    • Check services and see if "QEMU Guest Agent" and "QEMU Guest Agent VSS Provider" services show in the list and are up and running.
If you do not have the QEMU Guest Agent service running on the Windows guest, the memory reported to Proxmox will be wrong. Also, if you don't have the ballooning service running on the guest OS, then Proxmox won't be able to efficiently manage the memory of the guest for you. If you are missing any of the above steps or need more info, see the Windows Guest Best Practices pages on the Proxmox wiki. There's one for Windows 10, Server 2012, and Windows 8, etc.
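The host side of that checklist can be sketched in Python as well. This assumes it is run on the Proxmox host with the VM ID as the only argument, and it only parses the "key: value" lines printed by "qm config <vmid>". The helper names are hypothetical, and it does not replace checking the services inside the Windows guest.

```python
import shutil
import subprocess
import sys


def parse_qm_config(text):
    """Parse 'key: value' lines as printed by `qm config <vmid>` into a dict."""
    config = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            config[key.strip()] = value.strip()
    return config


def agent_enabled(config):
    """True if the guest agent option is switched on in the VM config."""
    first = config.get("agent", "0").split(",")[0]
    return first in ("1", "enabled=1")


def balloon_disabled(config):
    """True if the balloon device is explicitly disabled (balloon: 0)."""
    return config.get("balloon") == "0"


if __name__ == "__main__":
    # Only runs on a host where the qm tool exists and a VM ID was given.
    if len(sys.argv) == 2 and shutil.which("qm"):
        result = subprocess.run(
            ["qm", "config", sys.argv[1]],
            capture_output=True, text=True, check=True,
        )
        cfg = parse_qm_config(result.stdout)
        print("memory (MiB):    ", cfg.get("memory", "not set"))
        print("balloon (MiB):   ", cfg.get("balloon", "not set"))
        print("agent enabled:   ", agent_enabled(cfg))
        print("balloon disabled:", balloon_disabled(cfg))
```

If the agent option is on in the config but the service is not running in the guest, the host still won't get correct memory figures, so both sides need checking.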

Hope this helps,
-Elfy
 