How to correctly configure swap nowadays?

Sep 13, 2022
Hi,

I would like to know how to correctly configure swap nowadays. In the past I just added swap to my LVM mirror and a line to /etc/fstab, but since then people have decided on systemd, which infects more and more of the system. Being open-minded, I tried to go along and use systemd correctly, but I have now spent more than two hours reading and trying, and I'm too stupid to configure swap correctly, so I need some hints or an example, please.

Background: I installed Proxmox, which did not add swap (which I thought was intentional, but swap is recommended). I learned that I must not put swap on ZFS if I want a safe system, because ZFS may itself need to swap while accessing ZFS volumes, so the swap space cannot live on one of them. Fortunately, I left a bit of space on the SSDs, so I could add at least a bit of swap space. Google brings up a lot of articles about swap, but all using the deprecated /etc/fstab, where a single line was sufficient. Now I need to use the (said to be) modern, simple systemd approach.

Since drives are sometimes re-ordered (for example, I had issues with ZFS where /dev/sda and /dev/sdb were renamed after a reboot), I read that we should use UUIDs. I tried to create the simple units, replacing the complicated single line with just a few simple commands:

Code:
root@pve:~# fdisk -l /dev/sd[cd]|grep swap
Partition 1 does not start on physical sector boundary.
/dev/sdc4  1853884416 1875384974   21500559  10.3G Linux swap
Partition 1 does not start on physical sector boundary.
/dev/sdd4  1853884416 1875384974   21500559  10.3G Linux swap
root@pve:~#
root@pve:~# blkid /dev/sd[cd]4
/dev/sdc4: UUID="968b2b23-9e51-4ac2-99d1-610200f91091" TYPE="swap" PARTUUID="8ab263bc-34ee-0b4f-9534-d7b0f53788a7"
/dev/sdd4: UUID="d32e80c3-a944-493b-afd7-ca1a779b721f" TYPE="swap" PARTUUID="e403401e-9286-e54b-9f79-47f11ce4945f"

root@pve:~# systemd-escape --suffix=swap --path /dev/disk/by-uuid/968b2b23-9e51-4ac2-99d1-610200f91091
dev-disk-by\x2duuid-968b2b23\x2d9e51\x2d4ac2\x2d99d1\x2d610200f91091.swap
root@pve:~# systemd-escape --suffix=swap --path /dev/disk/by-uuid/d32e80c3-a944-493b-afd7-ca1a779b721f
dev-disk-by\x2duuid-d32e80c3\x2da944\x2d493b\x2dafd7\x2dca1a779b721f.swap

root@pve:~# echo -e "[Swap]\nOptions=defaults\nTimeoutSec=5" >> '/etc/systemd/system/dev-disk-by\x2duuid-d32e80c3\x2da944\x2d493b\x2dafd7\x2dca1a779b721f.swap'
root@pve:~# echo -e "[Swap]\nOptions=defaults\nTimeoutSec=5" >> '/etc/systemd/system/dev-disk-by\x2duuid-968b2b23\x2d9e51\x2d4ac2\x2d99d1\x2d610200f91091.swap'

root@pve:~# systemctl daemon-reload

yes, "systemd-escape" produces "escaped" file names (!) that must be escaped for the shell (!),
no, the systemd bash completion does not complete the nice and simple unit names like dev-disk-by\x2duuid-968b2b23\x2d9e51\x2d4ac2\x2d99d1\x2d610200f91091.swap.
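For comparison, systemd.swap(5) describes What= as mandatory in the [Swap] section and requires the unit to be named after the (escaped) device path it controls. The following is only a minimal sketch under those assumptions: unit_name_for and swap_unit_body are hypothetical helpers, the escaping covers just the simple paths used in this thread, and WantedBy=swap.target is an assumed choice rather than something taken from the commands above. It prints the unit name and unit text and writes nothing to /etc.

```shell
#!/bin/sh
# Sketch only: builds the unit name and unit text, changes nothing on the system.

# Hypothetical helper mimicking `systemd-escape --suffix=swap --path` for the
# simple case of paths made of letters, digits, '/' and '-'.
unit_name_for() {
    # drop the leading '/', escape '-' as '\x2d', then turn '/' into '-'
    printf '%s\n' "${1#/}" | sed -e 's/-/\\x2d/g' -e 's|/|-|g' -e 's/$/.swap/'
}

# Print a swap unit for the device path given as $1.
# Options= and TimeoutSec= are copied from the commands above;
# What= and WantedBy=swap.target are the assumed additions.
swap_unit_body() {
    cat <<EOF
[Swap]
What=$1
Options=defaults
TimeoutSec=5

[Install]
WantedBy=swap.target
EOF
}

dev=/dev/disk/by-uuid/968b2b23-9e51-4ac2-99d1-610200f91091
unit_name_for "$dev"
swap_unit_body "$dev"
```

The printed name is the file name to use under /etc/systemd/system/, and the printed text is the file content; after writing it, systemctl daemon-reload followed by enabling and starting the unit would be the usual next steps.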

But it does not work:

Code:
root@pve:~# systemctl list-units --type=swap --all
  UNIT LOAD ACTIVE SUB DESCRIPTION
0 loaded units listed.
To show all installed unit files use 'systemctl list-unit-files'.
root@pve:~# systemctl enable 'dev-disk-by\x2duuid-968b2b23\x2d9e51\x2d4ac2\x2d99d1\x2d610200f91091.swap'
The unit files have no installation config (WantedBy=, RequiredBy=, Also=,
Alias= settings in the [Install] section, and DefaultInstance= for template
units). This means they are not meant to be enabled using systemctl.

Possible reasons for having this kind of units are:
• A unit may be statically enabled by being symlinked from another unit's
  .wants/ or .requires/ directory.
• A unit's purpose may be to act as a helper for some other unit which has
  a requirement dependency on it.
• A unit may be started when needed via activation (socket, path, timer,
  D-Bus, udev, scripted systemctl call, ...).
• In case of template units, the unit is meant to be enabled with some
  instance name specified.

This happens although the man page says "Swap unit files may include [Unit] and [Install] sections": "may", not "must" (apparently a bug).

Then I tried

Code:
root@pve:~# echo -e "[Install]\nWantedBy=multi-user.target\n[Swap]\nOptions=defaults\nTimeoutSec=5" >> '/etc/systemd/system/dev-disk-by\x2duuid-d32e80c3\x2da944\x2d493b\x2dafd7\x2dca1a779b721f.swap'
root@pve:~# echo -e "[Install]\nWantedBy=multi-user.target\n[Swap]\nOptions=defaults\nTimeoutSec=5" > '/etc/systemd/system/dev-disk-by\x2duuid-968b2b23\x2d9e51\x2d4ac2\x2d99d1\x2d610200f91091.swap'
root@pve:~# systemctl enable 'dev-disk-by\x2duuid-968b2b23\x2d9e51\x2d4ac2\x2d99d1\x2d610200f91091.swap'
root@pve:~# systemctl enable 'dev-disk-by\x2duuid-d32e80c3\x2da944\x2d493b\x2dafd7\x2dca1a779b721f.swap'
root@pve:~# systemctl start 'dev-disk-by\x2duuid-d32e80c3\x2da944\x2d493b\x2dafd7\x2dca1a779b721f.swap'
root@pve:~# systemctl start 'dev-disk-by\x2duuid-968b2b23\x2d9e51\x2d4ac2\x2d99d1\x2d610200f91091.swap'

(I'm so glad that I can use these simple commands instead of adding two lines to fstab).

I verified the result:

Code:
root@pve:~# cat /proc/swaps
Filename                                Type            Size            Used            Priority
/dev/sdd4                               partition       10750272        5120            -2
/dev/sdc4                               partition       10750272        0               -3

Obviously there is no redundancy anymore now!

So I just downgraded my setup from ECC RAM and redundant disks with ZFS checksums to "I must trust each individual medium for my most valuable active data", and on a single disk failure I can expect my system to fail. That's not the way.

How to add swap correctly?

Should I use mdadm or LVM to create a mirror? I used LVM in the past to get safe autodetection, but I read that since the introduction of systemd a lot of boot problems happen (especially after updates), for example because UUIDs change but are not updated in the initrd and whatnot.

Any hints appreciated!
 
Google brings up a lot of articles about swap, but all using the deprecated /etc/fstab, where a single line was sufficient. Now I need to use the (said to be) modern, simple systemd approach.
/etc/fstab still works and is still my preferred way. Why should you use a more complicated one in favor of a simple, decade-proven way?


Should I use mdadm or lvm to create a mirror?
Yes, or just use a RAID controller, if you already have one.

How to add swap correctly?
I recommend using zram as the first tier of swap and a disk-based swap behind it (possibly multi-tiered as well).
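A minimal sketch of such a tiering, under a few assumptions: the zram module is available, /dev/zram0 is the first free zram device, and the 2G size and priority 100 are arbitrary illustration values. Swap areas with a higher priority are used first, so a disk swap that the kernel lists with a negative priority in /proc/swaps would only be used once the zram tier is full. setup_zram_swap and run are hypothetical helper names, and the script only prints the commands unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Dry-run by default: commands are printed, not executed.
# Run with DRY_RUN=0 as root to actually apply them.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# $1 = size of the zram device (assumed default: 2G)
setup_zram_swap() {
    zram_size=${1:-2G}
    run modprobe zram
    # --find picks the first unused zram device; /dev/zram0 is assumed below
    run zramctl --find --size "$zram_size"
    run mkswap /dev/zram0
    # higher priority than the disk swap, so zram is filled first
    run swapon --priority 100 /dev/zram0
}

setup_zram_swap
```

Afterwards, cat /proc/swaps should list /dev/zram0 with priority 100 above the disk-based entry.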
 
Should I use mdadm or lvm to create a mirror?

Because mdadm is not installed by default, I used LVM:

Code:
root@pve:~# fdisk -l /dev/sd[cd] 2>/dev/null |grep LVM
/dev/sdc4  1853884416 1875384974   21500559  10.3G Linux LVM
/dev/sdd4  1853884416 1875384974   21500559  10.3G Linux LVM

root@pve:~# pvcreate /dev/sd[cd]4
root@pve:~# vgcreate swap0 /dev/sd[cd]4
  Volume group "swap0" successfully created

root@pve:~# lvcreate --mirrors 1 -l 100%FREE --name swap-mirror swap0

root@pve:~# mkswap /dev/swap0/swap-mirror
Setting up swapspace version 1, size = 10.2 GiB (11001655296 bytes)
no label, UUID=aefab30c-a7e0-4a19-8eb1-2405d47c71a4

root@pve:~# echo -e "[Install]\nWantedBy=multi-user.target\n[Swap]\nOptions=defaults\nTimeoutSec=5" > '/etc/systemd/system/dev-swap0-swap\x2dmirror.swap'

root@pve:~# systemctl enable 'dev-swap0-swap\x2dmirror.swap'
root@pve:~# systemctl start 'dev-swap0-swap\x2dmirror.swap'

root@pve:~# cat /proc/swaps
Filename                                Type            Size            Used            Priority
/dev/dm-4                               partition       10743804        256             -2
 
Hi!

My method: I don't use a dedicated partition for swap; I use allocated files instead.
(If I need more swap, I add another file.)

Code:
$> mkdir /swap
$> cd /swap
$> dd if=/dev/zero of=swap1-24G.swap bs=1G count=24
$> mkswap swap1-24G.swap
$> chmod 0400 swap1-24G.swap


Code:
/etc/fstab

/swap/swap1-24G.swap    swap    swap    defaults    0 0
 
Hi,

thank you for your quick reply!

/etc/fstab still works and is still my preferred way. Why should you use a more complicated one in favor of a simple, decade-proven way?
You shouldn't ask me that; I never understood why anybody wants systemd at all. I tried to avoid it as long as possible, but I can't anymore, so I try to make the best of it.
(OT: since it broke my own SSH key agent and I tried to work my way into it, I have seen no good in it. It started out as a way to optimize the startup speed of Pöttering's laptop, but my Devuan VMs actually boot faster. Then it infected more and more of the system's userland, leading to many bad effects: user "0day", hard-coded Google DNS resolvers, incomplete support for fstab (nofail wasn't supported in the first years, USB devices were suddenly needed for reboot), journalctl being practically slower than files, and bad userland tooling. Everything is complex, hidden, hard to debug and overengineered; it combines the disadvantages of Linux, Windows and Android; it is so immature and unstable that at least one essential thing changes almost every year; and it brings "desktop stuff" to our beloved servers: freedesktop.org/XDG specs about having temp cache files in $HOME that break NFS homes, dropping the /usr convention (which is still valid on millions of (embedded) Linux systems) for no reason except being lost in its own complexity, strange ideas about encrypting homes, and strange concepts such as Wayland dropping multi-user support and network transparency. Just to have a more powerful "desktop", most Linux server systems now include it.)

Yes, or just use a raid controller, if you already have.
I intentionally did not use one (or only in pass-through mode), because the Linux implementation seems to be much more compatible (so no issues when changing hardware), has better diagnostic messages, better command-line and monitoring tools, and appears to handle problems much better. For example, my LSI-whatever RAID takes hours to sync if a disk was off for a few seconds (like being pulled from the enclosure by mistake), whereas LVM takes only seconds.

I recommend using zram as the first tier of swap and a disk-based swap behind it (possibly multi-tiered as well).
Shouldn't I use zswap in this case?
zswap is (still?) disabled by default in Debian (and Proxmox), I assume for a good reason by people who know better than me, so apparently it has important disadvantages.
 
Hi,

thank you for your quick reply!

My method: I don't use a dedicated partition for swap; I use allocated files instead.
(If I need more swap, I add another file.)

Code:
$> mkdir /swap
$> cd /swap
$> dd if=/dev/zero of=swap1-24G.swap bs=1G count=24
$> mkswap swap1-24G.swap
$> chmod 0400 swap1-24G.swap


Code:
/etc/fstab

/swap/swap1-24G.swap    swap    swap    defaults    0 0

Thanks, but in my case this would create swap on top of ZFS and thus be unsafe and use the pre-systemd (aka "old") fstab file.

BTW, why did you use dd if=/dev/zero of=swap1-24G.swap bs=1G count=24 instead of fallocate -l 24gb 24G.swap?
 
Thanks, but in my case this would create swap on top of ZFS and thus be unsafe and use the pre-systemd (aka "old") fstab file.

BTW, why did you use dd if=/dev/zero of=swap1-24G.swap bs=1G count=24 instead of fallocate -l 24gb 24G.swap?
Just old habits :)
Use your favorite tool.

I use ext4 for root, not ZFS.
 
Shouldn't I use zswap in this case?
zswap is (still?) disabled by default in Debian (and Proxmox), I assume for a good reason by people who know better than me, so apparently it has important disadvantages.
I use it all over the place, on almost everything. Debian is very conservative about enabling new things by default, and swap is, as everything seems to be, "it depends". The real thing to optimize for is the swap-in/swap-out rate, not the swap usage. A lot of things in a (Linux) system are loaded just once, can be swapped out, and are read back in only if necessary. You can check this yourself by monitoring the swap-in/swap-out rates.
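One way to watch those rates, sketched under the assumption that the kernel exposes the pswpin and pswpout counters (pages swapped in and out since boot) in /proc/vmstat. swap_counters and swap_rate are hypothetical helper names; the si/so columns of vmstat 1 show similar information.

```shell
#!/bin/sh
# Print "pswpin pswpout" from a vmstat-style file (default: /proc/vmstat).
swap_counters() {
    awk '$1 == "pswpin"  { i = $2 }
         $1 == "pswpout" { o = $2 }
         END { print (i == "" ? 0 : i), (o == "" ? 0 : o) }' "${1:-/proc/vmstat}"
}

# Print the average pages per second swapped in and out over an interval
# given in seconds (default: 5).
swap_rate() {
    interval=${1:-5}
    set -- $(swap_counters)
    in1=$1 out1=$2
    sleep "$interval"
    set -- $(swap_counters)
    echo "swap-in: $(( ($1 - in1) / interval )) pages/s, swap-out: $(( ($2 - out1) / interval )) pages/s"
}

# Usage (takes 5 seconds):
#   swap_rate 5
```

Rates that stay near zero while swap usage is non-zero match the "swapped out once and left alone" pattern described above.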
 
I think you want a small amount, maybe 25% of RAM or less. On desktop systems I use 1GB even for 16 or 32 GB of RAM.

The theory here is that swap makes it easier for the system to manage memory. It provides a place to put little-used code when memory is briefly low and allows for the system to have more options. But you don't want to use the swap as a "RAM extension", that kills performance. Much better to add more memory if you need it.

I find that what often happens is that long-running systems will end up with a certain amount of swap usage but aren't actively swapping. That means memory pages that aren't used any more eventually end up in swap rather than taking up RAM. Which is a good thing! That data is not useful and even using that memory as cache is a better use of it.

Such unused memory is really common. Things that are used on startup and never again are normal; almost every program does that to some degree. Then there's the way lots of runtime memory management implementations work: when the program calls free()/delete/whatever to give back the memory, the runtime doesn't actually return it to the operating system but holds on to it in case it is needed later. Such blocks may or may not ever get re-used and can easily end up in swap.

So you want some amount of swap but not a lot. The one case where you need a lot of swap is if you plan to hibernate the system. For that you need swap >= RAM size.
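The rule of thumb above can be sketched as a small calculation, assuming MemTotal in /proc/meminfo is reported in kB. suggest_swap_mib is a hypothetical helper, and the 25% figure is the upper bound mentioned above, not a hard rule.

```shell
#!/bin/sh
# Suggest a swap size in MiB from a meminfo-style file.
# $1 = "normal" (25% of RAM) or "hibernate" (swap equal to RAM, the minimum)
# $2 = meminfo file (default: /proc/meminfo)
suggest_swap_mib() {
    awk -v mode="${1:-normal}" '
        $1 == "MemTotal:" {
            ram_mib = int($2 / 1024)        # MemTotal is given in kB
            size = (mode == "hibernate") ? ram_mib : int(ram_mib / 4)
            print size
        }' "${2:-/proc/meminfo}"
}

# Usage:
#   suggest_swap_mib normal
#   suggest_swap_mib hibernate
```

On a desktop where 1 GB is enough, the "normal" result is simply an upper bound to stay below.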
 