[feature suggestion] Bulk Migration Window with VM Memory Tracking

maurin

Renowned Member
Oct 10, 2013
Hello,

I regularly use the bulk migration feature to move my VMs from one node to another during upgrades that require a restart, such as kernel updates. Over the past few months I have twice miscalculated the RAM size of a VM while doing this and triggered the OOM killer on the receiving node!

My nodes and VMs have very little swap space on disk, and I use zswap on the hypervisors (the data inside the VM processes is highly compressible).

Having an indicator of the RAM usage of each VM in the migration window's selector would reduce the risk of saturating the destination node.
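
As a rough illustration of the check I have in mind, here is a minimal Python sketch (the VM names and memory sizes are just example values, and MemAvailable naturally has to be read on the destination node):

Code:
#!/usr/bin/env python3
# Rough manual pre-flight check before a bulk migration (illustrative only).
# The VM names and memory sizes are placeholders; in practice they would come
# from each VM's configured memory.

# Configured RAM of the VMs selected for migration, in MiB (example values).
vms_to_migrate = {"vm-101": 4096, "vm-102": 8192, "vm-103": 2048}

def mem_available_mib(meminfo_path="/proc/meminfo"):
    """MemAvailable in MiB, read on the node this script runs on."""
    with open(meminfo_path) as f:
        for line in f:
            if line.startswith("MemAvailable:"):
                return int(line.split()[1]) // 1024  # /proc/meminfo is in kB
    raise RuntimeError("MemAvailable not found")

needed_mib = sum(vms_to_migrate.values())
available_mib = mem_available_mib()  # run this part on the destination node
print(f"configured RAM to move:  {needed_mib} MiB")
print(f"MemAvailable on target:  {available_mib} MiB")
if needed_mib > available_mib:
    print("WARNING: the destination node is likely to hit the OOM killer")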

Thanks,
Sylvain Maurin
 
Having an indicator of the RAM usage of each VM in the migration window's selector would reduce the risk of saturating the destination node.
The reported RAM usage for each VM is wrong. It always has been and always will be. It's only an indicator, and if you run Windows it will mislead you even more, because Windows itself lies about real RAM usage, and every Windows admin wants to see the same lies the Task Manager tells (just search the forums). PVE therefore had to be changed to reflect those memory lies. Sadly, this is neither a joke nor an exaggeration.

You need at least as much free memory as a VM is configured with in order to migrate it. Every KVM/QEMU process needs additional RAM on top of that: if you have enabled anything other than "no cache" for your disks you will need more RAM, your virtualized devices (e.g. the emulated graphics card) need additional RAM, and so on. This is not something you can plan for precisely; it depends heavily on your settings and on how long the VM has already been running. It is also not PVE-specific; it's the same with every hypervisor, and mileage may vary.
 
The reported RAM usage for each VM is wrong. It always has been and always will be. It's only an indicator, and if you run Windows it will mislead you even more, because Windows itself lies about real RAM usage, and every Windows admin wants to see the same lies the Task Manager tells (just search the forums). PVE therefore had to be changed to reflect those memory lies. Sadly, this is neither a joke nor an exaggeration.

I disagree: any KVM process on the hypervisor is just another Linux process, and the RAM usage collected from /proc/[pid]/smaps is fairly accurate.
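
To illustrate, here is a small sketch that sums the resident memory of the running QEMU/KVM processes from smaps_rollup (it assumes the processes show up with a process name of "kvm" or "qemu-system-*", and it needs root; adjust the match for your setup):

Code:
#!/usr/bin/env python3
# Illustrative sketch: sum the resident memory of running QEMU/KVM processes
# from /proc/<pid>/smaps_rollup (requires root and a reasonably recent kernel).
# Assumption: the processes are named "kvm" or "qemu-system-*".
import os

def qemu_pids():
    for pid in os.listdir("/proc"):
        if not pid.isdigit():
            continue
        try:
            with open(f"/proc/{pid}/comm") as f:
                comm = f.read().strip()
        except OSError:
            continue  # process vanished or is not readable
        if comm == "kvm" or comm.startswith("qemu-system"):
            yield int(pid)

def rss_kib(pid):
    """Resident set size of one process, in KiB, from smaps_rollup."""
    with open(f"/proc/{pid}/smaps_rollup") as f:
        for line in f:
            if line.startswith("Rss:"):
                return int(line.split()[1])
    return 0

total_kib = 0
for pid in qemu_pids():
    kib = rss_kib(pid)
    total_kib += kib
    print(f"pid {pid}: {kib / 1024:.0f} MiB resident")
print(f"total resident across VM processes: {total_kib / 1024:.0f} MiB")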
You need at least as much free memory as a VM is configured with in order to migrate it. Every KVM/QEMU process needs additional RAM on top of that: if you have enabled anything other than "no cache" for your disks you will need more RAM, your virtualized devices (e.g. the emulated graphics card) need additional RAM, and so on. This is not something you can plan for precisely; it depends heavily on your settings and on how long the VM has already been running. It is also not PVE-specific; it's the same with every hypervisor, and mileage may vary.
Compressed swap in memory does not help here, but by subtracting zswap.max_pool_percent from the total RAM you can determine the minimum memory that remains available. Being able to look at VM memory usage during migration would allow someone like me, who is not very experienced, to avoid moving a VM that might trigger the OOM killer.
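
For example, a minimal sketch of that arithmetic (standard Linux paths, assuming zswap is enabled on the hypervisor):

Code:
#!/usr/bin/env python3
# Worst-case size of the zswap pool and the RAM guaranteed to remain
# uncompressed, derived from MemTotal and zswap's max_pool_percent.

with open("/proc/meminfo") as f:
    mem_total_kib = next(int(line.split()[1])
                         for line in f if line.startswith("MemTotal:"))

with open("/sys/module/zswap/parameters/max_pool_percent") as f:
    max_pool_percent = int(f.read())

pool_max_kib = mem_total_kib * max_pool_percent // 100  # worst-case pool size
remaining_kib = mem_total_kib - pool_max_kib

print(f"MemTotal:                 {mem_total_kib / 1024:.0f} MiB")
print(f"zswap pool (worst case):  {pool_max_kib / 1024:.0f} MiB ({max_pool_percent}%)")
print(f"minimum uncompressed RAM: {remaining_kib / 1024:.0f} MiB")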
 
I disagree: any KVM process on the hypervisor is just another Linux process, and the RAM usage collected from /proc/[pid]/smaps is fairly accurate.
Okay, you know your memory, and I totally agree. My initial answer was a couple of abstraction levels higher up. The approach via smaps is much better than the one currently in PVE, which bases the reported RAM usage on the configured RAM alone and which is in fact "not fairly accurate", often quite far off. I remember discussing this "virtualization" overhead on the forums as well, but I cannot find the thread anymore. AFAIK we saw an overhead of 5-15% per VM, depending on the size of the display memory, disk cache, etc.
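
To put that range in concrete numbers, a back-of-the-envelope example (the 16 GiB VM is just an assumed figure):

Code:
# Back-of-the-envelope example for the 5-15% overhead: destination headroom
# to budget for a single VM with 16 GiB of configured RAM (assumed example).
configured_gib = 16
for overhead in (0.05, 0.15):
    print(f"{overhead:.0%} overhead -> plan for {configured_gib * (1 + overhead):.1f} GiB")
# 5% overhead -> plan for 16.8 GiB
# 15% overhead -> plan for 18.4 GiB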


Compressed swap in memory does not help here, but by subtracting zswap.max_pool_percent from the total RAM you can determine the minimum memory that remains available. Being able to look at VM memory usage during migration would allow someone like me, who is not very experienced, to avoid moving a VM that might trigger the OOM killer.
I wasn't talking about that, but yes, that is another valid point.
 
