[SOLVED] Switch from Default ZFS Swap to dedicated swap drive?

mattlach
Renowned Member
Mar 23, 2016
Boston, MA
Hi all,

So when I installed Proxmox to my two main install SSDs in a ZFS mirrored configuration, I did not realize that it would be creating a swap partition on the ZFS drives. (I should have, because one is needed, and where else would it go, but I just didn't think of it.)

My two main SSDs in this mirror are not high write endurance drives, so for obvious reasons I'd prefer to keep swap writes off them if avoidable.

I have a 128GB dedicated high write endurance SSD I could use instead. I THINK I know how to go about switching to it, but I just wanted to check in here first to make sure I'm not going to break anything.

First, I would use fdisk to partition the disk I want to use and create a Linux swap partition on it, filling the entire drive.

Then I would use the "swapoff" command to turn off the existing swap (with VMs and containers shut down, just in case).

Next, I would edit /etc/fstab and change it to point at the new drive I just added a swap partition on, instead of the ZFS swap.

Now, I could use the "swapon" command to turn swap back on, and it should be on the new drive.

Since it appears that the ZFS swap is just a dataset on the main pool of my mirror, I should now be able to remove it by using a command like the below:

zfs destroy rpool/swap
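Condensed into commands, a sketch of the plan might look like this (device and dataset names here are assumptions, not verified against any particular install; check what is actually in use with swapon --show first):

```shell
# Sketch only: /dev/sdc (new SSD) and rpool/swap are assumed names.
swapon --show                        # see what swap is currently active

# 1. Give the new SSD a single Linux swap partition (GPT type 8200)
sgdisk --zap-all /dev/sdc
sgdisk -n 1:0:0 -t 1:8200 /dev/sdc

# 2. Turn off the existing ZFS-backed swap
swapoff -a

# 3. Write the swap signature and point /etc/fstab at the new partition
mkswap /dev/sdc1
blkid /dev/sdc1                      # copy the UUID into the fstab swap entry

# 4. Re-enable swap from /etc/fstab and verify it is on the new drive
swapon -a
swapon --show

# 5. Once satisfied, destroy the old swap dataset
zfs destroy rpool/swap
```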


So the above is the plan. Is there any reason I can't or shouldn't do this? Could it cause any trouble?

Thanks
Matt
 
If you are mirroring your boot/data, I recommend mirroring swap as well.
 
So when I installed Proxmox to my two main install SSDs in a ZFS mirrored configuration, I did not realize that it would be creating a swap partition on the ZFS drives. (I should have, because one is needed, and where else would it go, but I just didn't think of it.)

Swap doesn't have to be inside ZFS, and it is probably better off outside it. The disk is split into three partitions, and the large partition 2 is given to ZFS. It could just as easily have been partitioned with a dedicated swap partition at the GPT level.
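You can see that layout for yourself; assuming the first boot disk shows up as /dev/sda, something like:

```shell
# Inspect the installer's partition layout (disk name is an assumption)
lsblk -o NAME,SIZE,TYPE,FSTYPE /dev/sda
sgdisk -p /dev/sda   # prints the GPT table; the large partition is ZFS
```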

One of the worst things you can do is start swapping to ZFS when your memory is low, because ZFS itself uses so much memory.
 
If you are mirroring your boot/data, I recommend mirroring swap as well.

I'm guessing you'd be concerned about the system going down in case of drive corruption in areas used by active swap?

I'm not terribly concerned about this. If I were running a production high availability server I would, but this is just a home server. I only mirror the boot drive to save my images if something goes wrong.

The way I see it, I'm not mirroring my RAM, so mirroring my swap would be of limited use. That, and I hope to have enough RAM to keep swap use to a minimum.

Looks like a good, and careful, plan to me.

Appreciate the feedback!


Side note:

How do you guys get the subscriber badges next to your names on the forums? I signed up for a subscription a few days ago, and I don't have that badge. I thought I might have to add my subscription key in the forum settings somewhere, but I can't find that option anywhere.
 
Swap doesn't have to be inside ZFS, and it is probably better off outside it. The disk is split into three partitions, and the large partition 2 is given to ZFS. It could just as easily have been partitioned with a dedicated swap partition at the GPT level.

Thanks for that, it is very helpful. I knew this was the case for every Linux distribution I've used; what I didn't know was whether I'd break anything Proxmox-specific if I started tinkering with the default swap location.

One of the worst things you can do is start swapping to ZFS when your memory is low, because ZFS itself uses so much memory.

This is a very good point. Increased disk activity from swapping out RAM might result in a growing ARC, which would further reduce the available RAM and require more swapping...

That being said, I think the ARC is typically the lowest-priority user of RAM, and is usually discarded if other programs need the memory, as it is mostly there for caching purposes.

Another shortcoming of the Proxmox ZFS implementation is that the default installer references the member disks by their dev names (e.g. /dev/sda, /dev/sdb, etc.). This is not best practice in ZFS, as drives can (and do) move. It would probably be better if the default installer either created random UUIDs for the partitions and used those UUIDs to define the pool, or just used /dev/disk/by-id/ instead. The latter would likely work as well as UUIDs in most cases, since the drive's serial number is usually part of the ID, making it typically unique, but if - for some reason - a drive's firmware did not present a serial number, you COULD run into problems there too, so UUIDs are probably the best default solution.
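For a pool that is not the root pool, the usual way to switch to by-id names is an export followed by an import pointed at /dev/disk/by-id. A sketch (the pool name "tank" is a placeholder; a booted root pool cannot simply be exported like this):

```shell
# Placeholder pool name; do NOT try this on a mounted root pool.
zpool export tank
zpool import -d /dev/disk/by-id tank
zpool status tank    # members should now be listed by persistent IDs
```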

On my server I might try - one by one - replacing each of the two members of the pool with the "by-id" version of itself and see if that works. With a little luck it will accept the same drive as a replacement for itself without complaining that it is already in use. The "zpool labelclear" command seems broken for me, and I usually wind up having to dd an entire drive with zeroes from /dev/zero in order to clear the label, since ZFS places four copies of the label spread out on the drive (two near the start, and two near the end). This is both time consuming and wastes my already limited SSD write cycles.
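Since the four ZFS labels live in the first and last 512 KiB of the device, zeroing just those two regions is usually enough, with no need to dd the whole drive. A sketch, assuming the old member is /dev/sdc1 and its size is a multiple of 512 KiB:

```shell
# Try the official route first:
zpool labelclear -f /dev/sdc1

# If that fails, zero just the label areas: two 256 KiB labels at the
# start of the device and two at the end.
dd if=/dev/zero of=/dev/sdc1 bs=512K count=1
# blockdev --getsz reports 512-byte sectors; /1024 converts to 512 KiB units
dd if=/dev/zero of=/dev/sdc1 bs=512K count=1 \
   seek=$(( $(blockdev --getsz /dev/sdc1) / 1024 - 1 ))
```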
 
For what it is worth, I followed my proposed steps, and after a brief issue getting the UUID of the newly created swap partition to update in udev - because I mistakenly thought that marking a partition as swap (type 82?) in fdisk automatically ran mkswap - everything worked beautifully.
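For anyone following along: setting the partition type in fdisk only changes the type code in the partition table; the swap signature (and the UUID that udev picks up) is written by mkswap. Roughly, with an assumed partition name:

```shell
mkswap /dev/sdc1       # writes the swap signature and a fresh UUID
blkid /dev/sdc1        # shows the UUID for the /etc/fstab entry, e.g.:
# UUID=xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx  none  swap  sw  0  0
swapon -a
```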
 
Hi all,

So when I installed Proxmox to my two main install SSDs in a ZFS mirrored configuration, I did not realize that it would be creating a swap partition on the ZFS drives. (I should have, because one is needed, and where else would it go, but I just didn't think of it.)

My two main SSDs in this mirror are not high write endurance drives, so for obvious reasons I'd prefer to keep swap writes off them if avoidable.

I have a 128GB dedicated high write endurance SSD I could use instead. I THINK I know how to go about switching to it, but I just wanted to check in here first to make sure I'm not going to break anything.

First, I would use fdisk to partition the disk I want to use and create a Linux swap partition on it, filling the entire drive.

Then I would use the "swapoff" command to turn off the existing swap (with VMs and containers shut down, just in case).

Next, I would edit /etc/fstab and change it to point at the new drive I just added a swap partition on, instead of the ZFS swap.

Now, I could use the "swapon" command to turn swap back on, and it should be on the new drive.

Since it appears that the ZFS swap is just a dataset on the main pool of my mirror, I should now be able to remove it by using a command like the below:

zfs destroy rpool/swap


So the above is the plan. Is there any reason I can't or shouldn't do this? Could it cause any trouble?

Thanks
Matt
Dear Matt,

Yes, it's an old post. I've been using Proxmox for almost a year now and am still learning the best way to set things up. I really like the separate swap partition you've set up.

Would you mind explaining how you did it, or perhaps pointing me to some guides that walk me through the process?

I'd appreciate it a lot.
 
Dear Matt,

Yes, it's an old post. I've been using Proxmox for almost a year now and am still learning the best way to set things up. I really like the separate swap partition you've set up.

Would you mind explaining how you did it, or perhaps pointing me to some guides that walk me through the process?

I'd appreciate it a lot.

Hi, I just saw this post, and it is way too late to start typing a detailed howto guide.

As someone earlier in this thread posted, if you mirror everything else it is probably a good idea to mirror swap as well, but swap is not a good fit for ZFS, because you can create a feedback loop of sorts.

Imagine the following situation:

1.) You run low on RAM.
2.) The system swaps out some active RAM to a swap partition on a ZFS pool.
3.) This causes the ZFS ARC (which is in RAM) to grow.
4.) You are now low on RAM again.
5.) Rinse and repeat.

The best way to avoid this is probably to create your mirrored swap using something other than ZFS.

I did mine by creating a Linux MDADM mirror from two drives, and putting a swap partition on the resulting md device.
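Roughly, that setup looks like this (device names are placeholders for equal-sized partitions on the two dedicated swap drives):

```shell
# Build a RAID1 md device from two partitions, then put swap on it
mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sdc1 /dev/sdd1
mkswap /dev/md0
swapon /dev/md0
# and persist it in /etc/fstab:
# /dev/md0  none  swap  sw  0  0
```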

If you Google it, there should be plenty of guides covering this.

I can come back another time when it is not this late, and try to explain it further.
 
