[SOLVED] high swap usage and a way to add swap.

RobFantini

At pve > node > summary, swap usage has been at 90%+ on a few nodes, while memory usage was in all cases below 50%.

I've noticed this just within the last month. We were busy with other issues, so we just added swap. Note that we used to use zram, but had issues under some circumstances. To add swap [and there are other ways, some better]:

* add swap on zfs rpool
Code:
# note: swap on a ZFS zvol can deadlock under memory pressure; the upstream
# OpenZFS FAQ suggests creating the zvol with options such as
#   -b $(getconf PAGESIZE) -o compression=zle -o primarycache=metadata
zfs create -V 10G rpool/swap1
mkswap /dev/zvol/rpool/swap1
swapon /dev/zvol/rpool/swap1
# make it persistent across reboots:
echo "/dev/zvol/rpool/swap1 none swap sw 0 0" >> /etc/fstab
* add a swap file on ext4
Code:
fallocate -l 8G /swapfile
chmod 0600 /swapfile
chown root:root /swapfile
mkswap /swapfile
swapon /swapfile
# make it persistent across reboots:
echo "/swapfile swap swap defaults 0 0" >> /etc/fstab
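Either way, it is worth verifying that the new swap area is active and how much of it is actually in use. A quick read-only check on any Linux host:

```shell
# list active swap areas (the same data `swapon --show` formats for you)
cat /proc/swaps
# swap totals as the kernel sees them
grep -E '^(SwapTotal|SwapFree)' /proc/meminfo
```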


Now, why swap usage got so high is something to debug and fix. When I am caught up on projects, we'll try removing the extra swap on a test system etc.

Has anyone else seen swap usage very high?
 
You can check the host with "journalctl -xe" and "dmesg -wH"; if the kernel does not report anything, memory usage is probably normal. I experienced this problem myself. One question for you: do you use KSM? Proxmox already comes with this memory-deduplication feature, but you can also raise "KSM_NPAGES_MAX" and "KSM_THRES_COEF".

If your guests generally run the same operating system, KSM will work far more effectively.
 
You can check the host with "journalctl -xe" and "dmesg -wH"; if the kernel does not report anything, memory usage is probably normal. I experienced this problem myself. One question for you: do you use KSM? Proxmox already comes with this memory-deduplication feature, but you can also raise "KSM_NPAGES_MAX" and "KSM_THRES_COEF".

If your guests generally run the same operating system, KSM will work far more effectively.

Hello
thanks for the answer,
and where are "KSM_NPAGES_MAX" and "KSM_THRES_COEF" set?
 
Code:
cat /etc/ksmtuned.conf
# Configuration file for ksmtuned.

# How long ksmtuned should sleep between tuning adjustments
# KSM_MONITOR_INTERVAL=60

# Millisecond sleep between ksm scans for 16Gb server.
# Smaller servers sleep more, bigger sleep less.
# KSM_SLEEP_MSEC=100

# KSM_NPAGES_BOOST=1000
# KSM_NPAGES_DECAY=-50
# KSM_NPAGES_MIN=64
KSM_NPAGES_MAX=20000

KSM_THRES_COEF=80
# KSM_THRES_CONST=2048

# uncomment the following if you want ksmtuned debug info

LOGFILE=/var/log/syslog
DEBUG=1

KSM is a memory-deduplication system for KVM, so you should use it. With the example configuration above, KSM starts when free memory drops below 80% (KSM_THRES_COEF=80) and scans up to 20,000 pages per cycle (KSM_NPAGES_MAX=20000; this costs CPU time, so lower it if your CPU is not very powerful). It also logs at debug level 1 to /var/log/syslog, so you can watch what KSM is doing with "tail -f /var/log/syslog".
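If you want to see whether KSM is actually merging anything, the live counters are in /sys/kernel/mm/ksm (the paths are standard on Linux; the values of course depend on your host):

```shell
# KSM counters live in sysfs (assumes a Linux host with KSM compiled in)
if [ -d /sys/kernel/mm/ksm ]; then
    # 1 = ksmd running, 0 = stopped
    cat /sys/kernel/mm/ksm/run
    # pages_sharing / pages_shared = deduplication ratio;
    # pages_sharing * page size = RAM actually saved by KSM
    grep . /sys/kernel/mm/ksm/pages_shared /sys/kernel/mm/ksm/pages_sharing
else
    echo "KSM not available on this kernel"
fi
```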
 
We were busy with other issues so we just added swap.

<smartass>
A better way would be add RAM :-D
</smartass>

Adding swap is never a good option, because your system will become slower and slower.

I hear you; we also ran into such problems on non-PVE machines, and we ended up monitoring the swap-in rate so that we could detect whether we were hitting a memory wall. I recommend using a monitoring system that also allows storing and analysing metrics so that you can find out when things started to get ugly. Normally, overall response times will drop significantly once you start to use swap regularly, and this is to be avoided at all costs.
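Even without a full monitoring stack, the swap-in rate can be sampled directly from the kernel's cumulative counters in /proc/vmstat (a minimal sketch; pswpin/pswpout are pages swapped since boot):

```shell
# sample the swap-in/out page counters twice, two seconds apart
a_in=$(awk '/^pswpin /{print $2}' /proc/vmstat)
a_out=$(awk '/^pswpout /{print $2}' /proc/vmstat)
sleep 2
b_in=$(awk '/^pswpin /{print $2}' /proc/vmstat)
b_out=$(awk '/^pswpout /{print $2}' /proc/vmstat)
echo "pages swapped in  over 2s: $((b_in - a_in))"
echo "pages swapped out over 2s: $((b_out - a_out))"
```

A growing swap-out delta with a near-zero swap-in delta is usually harmless; sustained swap-ins are the real signal that the working set no longer fits in RAM.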
 
Thank you @ertanerbek. I'll try using KSM after current projects.

KSM is enabled in PVE by default, so there is no need to activate it. It also comes with PVE best-practice values, so only tune it "if you know what you're doing". If you want to pack as many VMs as possible onto your system and want to use KSM, it is crucial that you have a lot of overlap in operating systems (including the exact patch level) so that it has a chance to work. So update and reboot your guests regularly in order to get the most out of KSM.

Another way would be moving to containers, if you have a lot of Linux machines. KSM is not useful there, but they share the same kernel, which will probably save you 50-150 MB of memory per guest.
 
<smartass>
A better way would be add RAM :-D
</smartass>

Adding swap is never a good option, because your system will become slower and slower.

I hear you; we also ran into such problems on non-PVE machines, and we ended up monitoring the swap-in rate so that we could detect whether we were hitting a memory wall. I recommend using a monitoring system that also allows storing and analysing metrics so that you can find out when things started to get ugly. Normally, overall response times will drop significantly once you start to use swap regularly, and this is to be avoided at all costs.

Hi there

In the last year we increased RAM from 64 to 94 GB on most systems.

The system that had the last swap issue has 204 GB of RAM, and per pve stats, on average for the last month, it has less than 20% usage.

The high swap usage could have been a false alarm, or not. Have you heard the stories about the monster that goes around scaring systems to death by breathing in all the RAM in the room? Well, more swap is like garlic to those vampires.
 
High swap usage is not a bad thing. Under normal operations the kernel may decide to move things to swap to keep more RAM available. Also, high swap does not mean slower systems, as long as those memory pages are not accessed frequently. I have been using this command to find what is using most of the swap space:
Code:
for file in /proc/*/status ; do awk '/VmSwap|Name/{printf $2 " " $3}END{ print ""}' $file; done | sort -k 2 -n -r | less

Also check vmstat 1, especially the si/so columns, for any swap input/output activity. In my opinion swap is a good thing and keeps our systems healthy and virtual memory happy.
 
The high swap usage could have been a false alarm, or not. Have you heard the stories about the monster that goes around scaring systems to death by breathing in all the RAM in the room? Well, more swap is like garlic to those vampires.

Have you checked the swappiness of the systems? Maybe they start swapping very early.
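For reference, swappiness can be checked and changed at runtime; the value 10 below is just an example, not a recommendation:

```shell
# current value; the default on most kernels is 60
cat /proc/sys/vm/swappiness
# lower it at runtime so the kernel prefers reclaiming page cache
# over swapping out anonymous pages (needs root):
#   sysctl vm.swappiness=10
# persist it across reboots:
#   echo 'vm.swappiness = 10' >> /etc/sysctl.conf
```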
 
High swap usage is not a bad thing. Under normal operations the kernel may decide to move things to swap to keep more RAM available. Also, high swap does not mean slower systems, as long as those memory pages are not accessed frequently.

Yes, it depends heavily on the systems that run there, but what are systems good for if they are not used? I have none of them on my production servers. We have them, however, on our test machines and background work horses.
 
High swap usage is not a bad thing. Under normal operations the kernel may decide to move things to swap to keep more RAM available. Also, high swap does not mean slower systems, as long as those memory pages are not accessed frequently. I have been using this command to find what is using most of the swap space:
Code:
for file in /proc/*/status ; do awk '/VmSwap|Name/{printf $2 " " $3}END{ print ""}' $file; done | sort -k 2 -n -r | less

Also check vmstat 1, especially the si/so columns, for any swap input/output activity. In my opinion swap is a good thing and keeps our systems healthy and virtual memory happy.

Thanks, I'll check that in the future.

However, there is something unusual going on when 3 or 4 systems out of 7, each on average using less than 20% RAM, use 90% swap. I've manually checked systems a few times per week for 20-25 years, usually with something like Zabbix or other tools in the old days to report issues. Only in the last few months have I seen such swap activity, and only on the pve screen.

Perhaps a swap usage graph history would be a good item to add to the pve screens.

again thanks for your suggestion.
 
Yes, it depends heavily on the systems that run there, but what are systems good for if they are not used? I have none of them on my production servers. We have them, however, on our test machines and background work horses.

I agree, and I like how ZFS uses memory. Same with Windows. From what I have seen, up to a point, no matter how much RAM is added, 85-90% of it is used.

Note that we have a 7-system cluster running Ceph very reliably. For the number of VMs we run, 5 would be overkill.
Regarding the high swap usage, I do not think it is normal on a system using 40 GB of 204 GB RAM.

One reason I started this thread is to bring it to other pve operators' attention.
 
Yes, it depends heavily on the systems that run there, but what are systems good for if they are not used? I have none of them on my production servers. We have them, however, on our test machines and background work horses.

@LnxBil That's a good point, and I agree that it may seem like misused resources, but the idea of swap is to keep systems healthy, not to be used as a RAM tradeoff. Think of all those pages that the kernel read from disk on boot, placed in RAM, and that are never going to be used again. Would you rather have those take up space in your pricey RAM, or have them moved to a spacious disk?
 
Think of all those pages that the kernel read from disk on boot, placed in RAM, and that are never going to be used again. Would you rather have those take up space in your pricey RAM, or have them moved to a spacious disk?

If they are memory mapped, they can be removed from memory completely without any negative impact on caching. That is the golden way for everything, but you're right about the space that lies around in kernel memory. Therefore it is of utmost importance to have a very slim hypervisor that barely needs RAM itself. Best to build everything from scratch (tried that a few times; you need to be very desperate) or just live with some memory waste. I also use zram all over the place to be able to swap out some stuff. That's why I wrote earlier that it is important to monitor the swap-ins/outs, not the swap usage as a whole.
 
I do not think it is normal on a system using 40 GB of 204 GB RAM.

Of course not. It seems that one (or some) process(es) went berserk, required a lot of memory, and left the system in the state you have right now. That is a drawback of not having memory constraints on everything. Maybe we'll get that in the future in Linux/PVE to limit every process easily (the hard way is already possible using control groups). I also have no idea how to deal with this in a good way.
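For the record, the control-group route doesn't have to be done by hand; on a systemd host you can wrap a single process in a transient scope with a memory cap. A sketch ("hungry-app" and the 2G value are just examples):

```shell
# cap a single command at 2 GB RAM and forbid it from using swap
# (needs root; systemd translates these into cgroup memory limits):
#   systemd-run --scope -p MemoryMax=2G -p MemorySwapMax=0 ./hungry-app
# the cap ends up as a plain cgroup file; here is the current shell's own limit:
cat "/sys/fs/cgroup$(awk -F: '$1==0{print $3}' /proc/self/cgroup)/memory.max" 2>/dev/null \
    || echo "cgroup v2 memory controller not visible here"
```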

A way that Oracle uses in its Trace File Analyser (TFA) is to dump the output of some standard programs like ps, top, iostat, netstat etc. to a file every 5 minutes (interval configurable) for later analysis, so that you can get a hint of which process was behaving strangely even after the process was killed or exited after wasting enormous amounts of memory.
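The same idea is easy to replicate with a small script run from cron; a sketch (the script path, the snapshot directory, and the five-minute interval are assumptions, tune to taste):

```shell
#!/bin/sh
# snapshot the top memory consumers for later post-mortem analysis;
# run from cron, e.g.:  */5 * * * * /usr/local/bin/mem-snapshot.sh
dir=/tmp/mem-snapshots          # use something under /var/log in production
mkdir -p "$dir"
ts=$(date +%Y%m%d-%H%M%S)
{
    echo "== top 15 processes by resident memory =="
    ps aux --sort=-rss | head -16
    echo "== memory / swap summary =="
    grep -E '^(MemTotal|MemFree|SwapTotal|SwapFree)' /proc/meminfo
} > "$dir/snapshot-$ts.txt"
# keep a week of history
find "$dir" -name 'snapshot-*.txt' -mtime +7 -delete
```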
 
So we've found that our X9 CPU systems use a high percentage of swap, even with 200 GB of RAM. The X10 systems have 64-96 GB of RAM and never go over 2% swap usage.

Could be something to do with the X9s, or who knows what. In any case we are upgrading the old X9s soon [as scheduled, not because of swap].
 
