You can use ballooning, in which case the host reclaims RAM from the VM. However, ballooning does not care whether the VM urgently needs that RAM. Suppose you set 8 GB max and 4 GB min RAM for a VM, and the VM currently uses 5 GB for system/user processes and 3 GB for caching. Ballooning will then slowly take RAM away until the VM is down to 4 GB. The VM first drops its 3 GB of cache, but ballooning does not stop there and takes another 1 GB. The VM is now 1 GB over its limit with no cache left to discard, so the guest's OOM killer has to kill processes until 1 GB has been freed.
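To make the arithmetic in that example concrete, here is a minimal sketch in Python. It is a toy model of the scenario described above, not how the balloon driver is implemented; the function name `balloon_to_minimum` and its parameters are made up for illustration.

```python
def balloon_to_minimum(process_gb, cache_gb, min_gb):
    """Toy model: the balloon shrinks the guest to min_gb.

    The guest gives up cache first. If process memory alone
    still exceeds min_gb, the remainder has to be freed by
    killing processes (OOM).
    """
    # Cache can only survive in whatever room is left after processes.
    cache_left = min(cache_gb, max(0, min_gb - process_gb))
    cache_dropped = cache_gb - cache_left
    # Anything processes use beyond min_gb must be freed by the OOM killer.
    oom_shortfall = max(0, process_gb - min_gb)
    return {"cache_dropped_gb": cache_dropped, "oom_shortfall_gb": oom_shortfall}


# The numbers from the post: 5 GB processes, 3 GB cache, 4 GB min RAM.
result = balloon_to_minimum(process_gb=5, cache_gb=3, min_gb=4)
print(result)
```

With the numbers from the post, all 3 GB of cache are dropped and 1 GB of process memory still has to go.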
Conclusion: RAM overprovisioning does not really work. You should not allocate more RAM to the guests than the host actually has available. And to keep the guests from tying up RAM unnecessarily for caching, it is best to allocate only as much RAM to each guest as it needs to run.
Translated with DeepL.com (free version)
And another posting from Dunuin:
VMs run via KVM, and each KVM process starts with a small memory footprint that matches what the guest actually uses. However, KVM never seems to release RAM. The KVM process can therefore only grow as the guest uses more RAM; it never shrinks again when the guest no longer needs it.
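The "grows but never shrinks" behavior described here can be pictured as a running maximum. The following is only a simplified sketch of that observation, with made-up usage numbers; it is not a measurement and says nothing about KVM internals.

```python
def kvm_process_footprint(guest_usage_gb):
    """Toy model: the host-side footprint follows the guest's peak usage.

    guest_usage_gb is a sequence of guest RAM usage samples over time.
    The returned list is the modeled size of the KVM process at each sample.
    """
    footprint = []
    peak = 0
    for usage in guest_usage_gb:
        # The process grows when the guest needs more, but never shrinks.
        peak = max(peak, usage)
        footprint.append(peak)
    return footprint


# Hypothetical samples: the guest peaks at 6 GB, then drops back.
footprint = kvm_process_footprint([1, 3, 6, 2, 4])
print(footprint)
```

In this model the footprint stays at the 6 GB peak even after the guest's own usage falls to 2 GB.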
If you don't give your guests more RAM than your server actually has, it doesn't matter that the RAM always fills up over time. That only matters if you want to overprovision RAM. With your hardware I would not allocate more than 56 GB (about 52 GiB) of RAM to the guests in total, possibly 5-10 GB more depending on how much RAM KSM deduplicates for you. If you give your guests more than that in total, you are overprovisioning and run the risk of the OOM killer killing VMs.
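A quick way to check a planned allocation against that budget is sketched below. The 56 GB figure comes from the post; the VM names and sizes are hypothetical, and the helper names `gb_to_gib` and `check_allocation` are made up for this example.

```python
GB = 10**9    # decimal gigabyte
GIB = 2**30   # binary gibibyte


def gb_to_gib(gb):
    """Convert decimal GB to binary GiB."""
    return gb * GB / GIB


def check_allocation(vm_ram_gib, budget_gib):
    """Return (total allocated GiB, True if the total exceeds the budget)."""
    total = sum(vm_ram_gib.values())
    return total, total > budget_gib


# Budget from the post: 56 GB, which is roughly 52 GiB.
budget_gib = gb_to_gib(56)

# Hypothetical guests and their configured RAM in GiB.
vms = {"vm-a": 16, "vm-b": 16, "vm-c": 24}

total, over = check_allocation(vms, budget_gib)
print(f"budget {budget_gib:.1f} GiB, allocated {total} GiB, overprovisioned: {over}")
```

Here the three guests add up to 56 GiB, which is above the roughly 52 GiB budget, so this layout would count as overprovisioned unless KSM saves enough to cover the difference.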
And if a VM does not use all of its RAM for system/user processes, that RAM is not simply wasted: the VM can still use it for caching, which improves performance.