[PVE 9/ZFS-Based VM/LXC Storage] Why Shouldn't I Disable Swap Inside Linux VMs?

Sep 1, 2022
I am aware that on bare metal Linux, or a non-ZFS-based Proxmox VE host, there are advantages to having swap enabled.

So, I'm asking specifically about using swap inside VMs stored as zvols on a thin-provisioned ZFS mirror pool. I have a Debian 13 VM that I've never seen use more than 1.2 GiB of its allotted 4 GiB of RAM, to the point that I might restrict it to 2 GiB soon. Debian 13's default ext4 partitioning scheme sets up a 2 GiB swap partition.

But I've seen this VM use swap, even if only 256 KiB.

I don't really like this. A zvol-backed paravirtualized SCSI virtual disk will always be slower than accessing RAM directly, and I gave the VM 4 GiB precisely because it's an I/O-heavy VM. I don't want to introduce avoidable latency.

So, the question: what reasons are there to keep this configuration, where an I/O-intensive VM with enough RAM to never run out of memory swaps to a zvol-backed virtual disk? What are the downsides of turning off swap in this setup?
Thanks. :)
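For context, this is roughly what turning swap off inside the Debian guest would look like. A minimal sketch, assuming the default ext4 layout with a dedicated swap partition; the fstab edit is demonstrated on a sample file so nothing on the running system is touched:

```shell
# Inspect current swap usage inside the guest (read-only)
swapon --show || true
free -h

# To disable swap for the current boot (as root): swapoff -a
# To make it permanent, comment out the swap entry in /etc/fstab.
# Demonstrated here on a sample fstab:
cat > /tmp/fstab.demo <<'EOF'
UUID=1111-2222 /     ext4 errors=remount-ro 0 1
UUID=3333-4444 none  swap sw                0 0
EOF
sed -i '/\sswap\s/ s/^/#/' /tmp/fstab.demo
cat /tmp/fstab.demo
```

After commenting out the entry and rebooting (or running `swapoff -a`), `swapon --show` should print nothing.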
 
I am aware that on bare metal Linux, or a non-ZFS-based Proxmox VE host, there are advantages to having swap enabled.
Swap inside (Linux) VMs is just as advantageous. Having ZFS (or Btrfs) underneath those VMs is not ideal, but it should not be a problem (with enterprise drives) unless the guests start thrashing, which is a problem in and of itself.
Writing to swap every once in a while is good for performance, throughput, and memory defragmentation. Continuously writing to and reading from swap is called thrashing, and that is very bad.
Of course, running applications without enough real memory and trying to use swap instead is a terrible idea regardless of virtualization and/or ZFS. And Proxmox does not really support memory over-commitment.
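The difference between the occasional healthy swap write and thrashing is visible in the kernel's swap counters. A quick sketch for checking this inside a guest (the specific swappiness value is a common rule of thumb, not an official recommendation):

```shell
# Cumulative swapped-in/out page counts since boot; sample these over time.
# A steadily climbing pair means the guest is actively thrashing.
grep -E '^pswp(in|out) ' /proc/vmstat || echo "kernel built without swap support"

# The kernel's eagerness to swap (default 60; lower = less eager):
cat /proc/sys/vm/swappiness
# To lower it at runtime (as root): sysctl -w vm.swappiness=10
# Persist via a file in /etc/sysctl.d/
```

Watching `vmstat 1` and its `si`/`so` columns shows the same information as a live rate.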
 
Did we mention this (much newer) article in the other threads? https://chrisdown.name/2026/03/24/zswap-vs-zram-when-to-use-what.html
I didn't, but I will definitely change this ASAP ;) One should take into account that for zswap you always need physical swap as a backing device. For example, if you use the defaults of PVE's installer for ZFS, you will end up without space for a dedicated swap device. And swap files are not recommended on ZFS, since they have caused problems in the past (not sure which exactly, to be fair). And physical swap will be slower than a RAM disk.
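For reference, a minimal sketch of how zswap is checked and enabled. The compressor and pool-percent values below are illustrative choices, not defaults, and as noted above a real swap device must still exist behind it:

```shell
# zswap compresses swapped pages in a RAM pool, but evicts cold pages to a
# real swap device, so a physical swap partition must still be configured.
# Check whether the running kernel has zswap enabled:
cat /sys/module/zswap/parameters/enabled 2>/dev/null || echo "no zswap module"

# Enable it at runtime (as root):
#   echo Y > /sys/module/zswap/parameters/enabled
# Or persistently on the kernel command line, e.g.:
#   zswap.enabled=1 zswap.compressor=zstd zswap.max_pool_percent=20
```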
Another thing to consider: depending on your perspective as an operator, you might come to a different conclusion than a kernel developer.

For example, in a comment on Chris Down's excellent article "In defense of swap", Kristian Koehntopp argued that he doesn't want to get into a situation that needs swap in the first place:
Chris writes:

Swap is primarily a mechanism for equality of reclamation, not for emergency “extra memory”. Swap is not what makes your application slow – entering overall memory contention is what makes your application slow.
And that is correct.

The conclusions are wrong, though. In a data center production environment that does not suck, I do not want to be in this situation.

I see it this way:

If I am ever getting into this situation, I want a failure, I want it fast and I want it to be noticeable, so that I can act on it and change the situation so that it never occurs again.
That is, I do not want to survive. I want this box to explode, others to take over and fix the root cause. So the entire section »Under temporary spikes in memory usage« is a DO NOT WANT scenario.

https://blog.koehntopp.info/2018/01...elopers-think-to-how-operations-people-think/

I especially like Koehntopp's final statement:
I think the main difference Chris and I seem to have on fault handling - I’d rather have this box die fast and with a clear reason than for it trying to eventually pull through and mess with my overall performance while it tries to do that.


Other examples are OpenSearch and Kubernetes, which both clearly state in their documentation that you should disable swap on hosts where you plan to run them. For example, the OpenSearch documentation for their recommended docker-compose files disables swap for this reason:
Swapping can dramatically decrease performance and stability, so you should ensure it is disabled on production clusters.

https://docs.opensearch.org/1.2/opensearch/install/important-settings/
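For illustration, a sketch of the host-side checks that go along with the linked "important settings" page (`vm.max_map_count=262144` is the value the OpenSearch docs recommend; the Docker flag at the end is a common companion setting for locking memory):

```shell
# Verify no swap is active on the host (ideally this prints nothing):
swapon --show || true

# OpenSearch wants a high mmap count; check the current value:
cat /proc/sys/vm/max_map_count

# As root, on a production host:
#   swapoff -a                          # disable swap
#   sysctl -w vm.max_map_count=262144   # raise mmap count (persist in sysctl.d)
# When running in Docker, the memlock ulimit is commonly raised as well:
#   docker run --ulimit memlock=-1:-1 ... opensearchproject/opensearch
```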

In the initial design of Kubernetes (https://kubernetes.io/blog/2025/03/25/swap-linux-improvements/ ) swap was deemed out of scope:
Prior to version 1.22, Kubernetes did not provide support for swap memory on Linux systems. This was due to the inherent difficulty in guaranteeing and accounting for pod memory utilization when swap memory was involved. As a result, swap support was deemed out of scope in the initial design of Kubernetes, and the default behavior of a kubelet was to fail to start if swap memory was detected on a node.
https://kubernetes.io/blog/2025/03/25/swap-linux-improvements/
Even now, it's disabled by default for Linux workloads (pods, in Kubernetes lingo):


Swap behaviors
You need to pick a swap behavior to use. Different nodes in your cluster can use different swap behaviors.

The swap behaviors you can choose for Linux nodes are:

NoSwap (default)
Workloads running as Pods on this node do not and cannot use swap.
LimitedSwap
Kubernetes workloads can utilize swap memory.
Note:
If you choose the NoSwap behavior, and you configure the kubelet to tolerate swap space (failSwapOn: false), then your workloads don't use any swap.

However, processes outside of Kubernetes-managed containers, such as system services (and even the kubelet itself!) can utilize swap.

You can read configuring swap memory on Kubernetes nodes to learn about enabling swap for your cluster.
https://kubernetes.io/docs/concepts/cluster-administration/swap-memory-management/
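For completeness, a minimal sketch of the kubelet configuration fields involved. The `failSwapOn` and `memorySwap.swapBehavior` field names come from the linked docs; the file path here is a temp path for demonstration, not where a real kubelet reads its config:

```shell
# A minimal kubelet config fragment that tolerates swap on the node
# and lets pods use it (LimitedSwap); the default behavior is NoSwap.
cat > /tmp/kubelet-config.yaml <<'EOF'
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
failSwapOn: false        # don't refuse to start when swap is detected
memorySwap:
  swapBehavior: LimitedSwap   # omit or set "NoSwap" to keep pods off swap
EOF
cat /tmp/kubelet-config.yaml
```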

So it all boils down to how constrained you are in RAM, how important predictable performance is to you, and what your workloads demand.
In a homelab you are probably more constrained in RAM than in anything else (personally, I'm fine with the performance penalties caused by zram or KSM), while in an enterprise setting you probably have enough RAM for your VMs and don't want to hurt performance by swapping out to physical disks or burning CPU cycles on compression algorithms. And if you happen to use something like Kubernetes or OpenSearch, you need a good reason not to go with their recommendations or defaults, which probably isn't the case for most scenarios.
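For anyone wanting to try the zram route in a homelab, here is a sketch using systemd-zram-generator (packaged on Debian and Fedora). The size and compressor below are illustrative choices, and the config is written to a temp path so nothing on the running system actually changes:

```shell
# systemd-zram-generator creates a compressed in-RAM swap device at boot
# from a small ini config. Demonstrated against a temp file:
cat > /tmp/zram-generator.conf <<'EOF'
[zram0]
# Use half of physical RAM for the zram swap device
zram-size = ram / 2
compression-algorithm = zstd
EOF
cat /tmp/zram-generator.conf
# Real path: /etc/systemd/zram-generator.conf, then reboot or restart
# systemd-zram-setup@zram0.service, and verify with: swapon --show
```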
Chris Down himself mentions Fedora's default setup (which targets Linux as a desktop system, so a different beast than servers) as an example where zram makes sense, and depending on your goals zram might still be the best approach (after considering his arguments for recommending zswap as the default):
But memory efficiency wasn't the only thing Fedora was optimising for, and several of their constraints had nothing to do with memory management at all. Within those constraints, the decision is coherent: optimality is always relative to what you're trying to achieve (and that point goes to you too, dear reader – you know better than me what you are trying to do).
https://chrisdown.name/2026/03/24/zswap-vs-zram-when-to-use-what.html

But he also argues that on servers you should probably use zswap instead. Being a proponent of zram even on servers myself, his piece gave me a lot to think about. I guess I will need to rethink whether I want to enable zram in the next rework of our VM template at work ;) I was quite tempted to add it, since (thanks to the ongoing RAM price crisis) we are constrained in adding physical RAM, so zram seemed like a cheap way to mitigate that at least a little.
But now I'm reconsidering whether going with zswap, or even going completely without swap, would be the more sensible approach.
So thanks for the link, I love it when new information challenges my assumptions ;)