[PVE 9/ZFS-Based VM/LXC Storage] Why Shouldn't I Disable Swap Inside Linux VMs?

Sep 1, 2022
I am aware that on bare metal Linux, or a non-ZFS-based Proxmox VE host, there are advantages to having swap enabled.

So, I'm asking specifically about using swap inside VMs stored as zvols on a thin-provisioned ZFS mirror pool. I have a Debian 13 VM that I've never seen use more than 1.2 GiB of its allotted 4 GiB of RAM, to the point that I might restrict it to 2 GiB soon. Debian 13's default ext4 partitioning scheme sets up a 2 GiB swap partition.

But I've seen this VM use swap, even if only 256 KiB.

I don't really like this. A zvol-backed paravirtualized SCSI virtual disk will always be slower than accessing RAM directly, and I gave the VM 4 GiB precisely because it's an I/O-heavy VM. I don't want to introduce avoidable latency.

So, the question: what reasons are there to keep this configuration, where an I/O-intensive VM with enough RAM to never run out of memory swaps to a zvol-backed virtual disk? What are the downsides of turning off swap in this setup?
Thanks. :)
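For context, this is roughly what turning swap off inside the Debian guest would look like. A minimal sketch, assuming the default ext4 layout with a dedicated swap partition; the fstab edit is demonstrated on a sample file so nothing on the running system is touched:

```shell
# Inspect current swap usage inside the guest (read-only)
swapon --show || true
free -h

# To disable swap for the current boot (as root): swapoff -a
# To make it permanent, comment out the swap entry in /etc/fstab.
# Demonstrated here on a sample fstab:
cat > /tmp/fstab.demo <<'EOF'
UUID=1111-2222 /     ext4 errors=remount-ro 0 1
UUID=3333-4444 none  swap sw                0 0
EOF
sed -i '/\sswap\s/ s/^/#/' /tmp/fstab.demo
cat /tmp/fstab.demo
```

After commenting out the entry and rebooting (or running `swapoff -a`), `swapon --show` should print nothing.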
 
I am aware that on bare metal Linux, or a non-ZFS-based Proxmox VE host, there are advantages to having swap enabled.
Swap inside (Linux) VMs is just as advantageous. Having ZFS (or Btrfs) underneath those VMs is not ideal, but it should not be a problem (with enterprise drives) unless the guests start thrashing, which is a problem in and of itself.
Writing to swap every once in a while is good for performance, throughput, and memory defragmentation. Continuously writing to and reading from swap is called thrashing, and that is very bad.
Of course, running applications without enough real memory and trying to use swap instead is a terrible idea regardless of virtualization and/or ZFS. And Proxmox does not really support memory over-commitment.
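The difference between the occasional healthy swap write and thrashing is visible in the kernel's swap counters. A quick sketch for checking this inside a guest (the specific swappiness value is a common rule of thumb, not an official recommendation):

```shell
# Cumulative swapped-in/out page counts since boot; sample these over time.
# A steadily climbing pair means the guest is actively thrashing.
grep -E '^pswp(in|out) ' /proc/vmstat || echo "kernel built without swap support"

# The kernel's eagerness to swap (default 60; lower = less eager):
cat /proc/sys/vm/swappiness
# To lower it at runtime (as root): sysctl -w vm.swappiness=10
# Persist via a file in /etc/sysctl.d/
```

Watching `vmstat 1` and its `si`/`so` columns shows the same information as a live rate.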
 
Did we mention this (much newer) article in the other threads? https://chrisdown.name/2026/03/24/zswap-vs-zram-when-to-use-what.html
I didn't, but I will definitely change this ASAP ;) One should take into account that for zswap you always need physical swap as a backing device. For example, if you use the defaults of PVE's installer for ZFS, you will end up without space for a dedicated swap device. And swap files are not recommended on ZFS, since they have caused problems in the past (not sure which exactly, to be fair). And physical swap will be slower than a RAM disk.
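For reference, a minimal sketch of how zswap is checked and enabled. The compressor and pool-percent values below are illustrative choices, not defaults, and as noted above a real swap device must still exist behind it:

```shell
# zswap compresses swapped pages in a RAM pool, but evicts cold pages to a
# real swap device, so a physical swap partition must still be configured.
# Check whether the running kernel has zswap enabled:
cat /sys/module/zswap/parameters/enabled 2>/dev/null || echo "no zswap module"

# Enable it at runtime (as root):
#   echo Y > /sys/module/zswap/parameters/enabled
# Or persistently on the kernel command line, e.g.:
#   zswap.enabled=1 zswap.compressor=zstd zswap.max_pool_percent=20
```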
Another thing to consider: depending on your perspective as an operator, you might come to a different conclusion than a kernel developer.

For example, in a comment on Chris Down's excellent article "In defense of swap", Kristian Koehntopp argued that he doesn't want to get into a situation that needs swap in the first place:
Chris writes:

Swap is primarily a mechanism for equality of reclamation, not for emergency “extra memory”. Swap is not what makes your application slow – entering overall memory contention is what makes your application slow.
And that is correct.

The conclusions are wrong, though. In a data center production environment that does not suck, I do not want to be in this situation.

I see it this way:

If I am ever getting into this situation, I want a failure, I want it fast and I want it to be noticeable, so that I can act on it and change the situation so that it never occurs again.
That is, I do not want to survive. I want this box to explode, others to take over and fix the root cause. So the entire section »Under temporary spikes in memory usage« is a DO NOT WANT scenario.

https://blog.koehntopp.info/2018/01...elopers-think-to-how-operations-people-think/

I especially like Koehntopp's final statement:
I think the main difference Chris and I seem to have on fault handling - I’d rather have this box die fast and with a clear reason than for it trying to eventually pull through and mess with my overall performance while it tries to do that.


Other examples are OpenSearch and Kubernetes, which both clearly state in their documentation that you should disable swap on hosts where you plan to run them. For example, the OpenSearch documentation for their recommended docker-compose files disables swap for this reason:
Swapping can dramatically decrease performance and stability, so you should ensure it is disabled on production clusters.

https://docs.opensearch.org/1.2/opensearch/install/important-settings/
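For illustration, a sketch of the host-side checks that go along with the linked "important settings" page (`vm.max_map_count=262144` is the value the OpenSearch docs recommend; the Docker flag at the end is a common companion setting for locking memory):

```shell
# Verify no swap is active on the host (ideally this prints nothing):
swapon --show || true

# OpenSearch wants a high mmap count; check the current value:
cat /proc/sys/vm/max_map_count

# As root, on a production host:
#   swapoff -a                          # disable swap
#   sysctl -w vm.max_map_count=262144   # raise mmap count (persist in sysctl.d)
# When running in Docker, the memlock ulimit is commonly raised as well:
#   docker run --ulimit memlock=-1:-1 ... opensearchproject/opensearch
```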

In the initial design of Kubernetes (https://kubernetes.io/blog/2025/03/25/swap-linux-improvements/ ) swap was deemed out of scope:
Prior to version 1.22, Kubernetes did not provide support for swap memory on Linux systems. This was due to the inherent difficulty in guaranteeing and accounting for pod memory utilization when swap memory was involved. As a result, swap support was deemed out of scope in the initial design of Kubernetes, and the default behavior of a kubelet was to fail to start if swap memory was detected on a node.
https://kubernetes.io/blog/2025/03/25/swap-linux-improvements/
Even now, it's disabled by default for Linux workloads (pods, in Kubernetes lingo):


Swap behaviors
You need to pick a swap behavior to use. Different nodes in your cluster can use different swap behaviors.

The swap behaviors you can choose for Linux nodes are:

NoSwap (default)
Workloads running as Pods on this node do not and cannot use swap.
LimitedSwap
Kubernetes workloads can utilize swap memory.
Note:
If you choose the NoSwap behavior, and you configure the kubelet to tolerate swap space (failSwapOn: false), then your workloads don't use any swap.

However, processes outside of Kubernetes-managed containers, such as system services (and even the kubelet itself!) can utilize swap.

You can read configuring swap memory on Kubernetes nodes to learn about enabling swap for your cluster.
https://kubernetes.io/docs/concepts/cluster-administration/swap-memory-management/
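For completeness, a minimal sketch of the kubelet configuration fields involved. The `failSwapOn` and `memorySwap.swapBehavior` field names come from the linked docs; the file path here is a temp path for demonstration, not where a real kubelet reads its config:

```shell
# A minimal kubelet config fragment that tolerates swap on the node
# and lets pods use it (LimitedSwap); the default behavior is NoSwap.
cat > /tmp/kubelet-config.yaml <<'EOF'
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
failSwapOn: false        # don't refuse to start when swap is detected
memorySwap:
  swapBehavior: LimitedSwap   # omit or set "NoSwap" to keep pods off swap
EOF
cat /tmp/kubelet-config.yaml
```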

So it all boils down to how constrained you are in RAM, how important predictable performance is to you, and what your workloads demand.
In a homelab you are probably more constrained in RAM than in anything else (personally, I'm fine with the performance penalties caused by zram or KSM), while in an enterprise setting you probably have enough RAM for your VMs and don't want to hurt performance by swapping out to physical disks or burning CPU cycles on compression algorithms. And if you happen to use something like Kubernetes or OpenSearch, you need a good reason not to go with their recommendations or defaults, which probably isn't the case for most scenarios.
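For anyone wanting to try the zram route in a homelab, here is a sketch using systemd-zram-generator (packaged on Debian and Fedora). The size and compressor below are illustrative choices, and the config is written to a temp path so nothing on the running system actually changes:

```shell
# systemd-zram-generator creates a compressed in-RAM swap device at boot
# from a small ini config. Demonstrated against a temp file:
cat > /tmp/zram-generator.conf <<'EOF'
[zram0]
# Use half of physical RAM for the zram swap device
zram-size = ram / 2
compression-algorithm = zstd
EOF
cat /tmp/zram-generator.conf
# Real path: /etc/systemd/zram-generator.conf, then reboot or restart
# systemd-zram-setup@zram0.service, and verify with: swapon --show
```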
Chris Down himself mentions Fedora's default setup (which targets Linux as a desktop system, so a different beast than servers) as an example where zram makes sense, and depending on your goals zram might still be the best approach (after considering his arguments for recommending zswap as the default):
But memory efficiency wasn't the only thing Fedora was optimising for, and several of their constraints had nothing to do with memory management at all. Within those constraints, the decision is coherent: optimality is always relative to what you're trying to achieve (and that point goes to you too, dear reader – you know better than me what you are trying to do).
https://chrisdown.name/2026/03/24/zswap-vs-zram-when-to-use-what.html

But he also argues that on servers you should probably use zswap instead. Being a proponent of zram even on servers myself, his piece gave me a lot to think about. I guess I will need to rethink whether I want to enable zram in the next rework of our VM template at work ;) I was quite tempted to add it, since (thanks to the ongoing RAM price crisis) we are constrained in adding physical RAM, so zram seemed like a cheap way to mitigate that at least a little.
But now I'm reconsidering whether going with zswap, or even going completely without swap, would be the more sensible approach.
So thanks for the link, I love it when new information challenges my assumptions ;)