Now I know what you are all thinking.
IT CAN'T BE DONE. ZFS only allows growing pools, not shrinking them.
But bear with me here.
Background:
I boot this server off of two mirrored NVMe drives. I got a fantastic deal on a set of 500GB Samsung 980 Pro drives the last time I rebuilt this server, and while I knew the usual caution about consumer drives, I gave in to temptation and went ahead anyway. It's a "PRO" drive, right?
Well, it is not working out.
One of the 500GB Samsung 980 Pro drives keeps going non-responsive, roughly every 90-120 days. It's like the firmware locks up or something. The drive is still listed as connected in lspci and the device in /dev/disk/by-id is still there, but it cannot be accessed at all. smartctl won't even return anything from it.
The last time this happened, a power cycle brought the drive back and it continued working as normal.
This time, since I have a few critical things running that I don't want to interrupt for a couple of days, I tried removing the device from the command line and rescanning the PCIe bus with the system running, hoping it would come back up. The rescan detects the device, but dmesg says "Device not ready" and doesn't re-attach it.
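For anyone curious, the remove/rescan I'm talking about is the usual sysfs approach, roughly this (the PCI address below is just a placeholder, not my actual device):
Code:
# find the PCI address of the stuck NVMe controller
lspci | grep -i 'non-volatile'
# tell the kernel to forget the device, then rescan the bus
echo 1 > /sys/bus/pci/devices/0000:01:00.0/remove
echo 1 > /sys/bus/pci/rescan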
It was worth a try.
Anyway, I happen to have a set of enterprise Optane drives I am not using, and I figured I'd swap them in. Unfortunately, while the Samsung drives are 500GB, the Optane drives are only 375GB, and ZFS does not support shrinking pools, only growing them.
This leads me to my crazy little plan:
Looking at the one remaining member of rpool in the system, the disk layout looks like this:
Code:
root@proxmox:~# fdisk -l /dev/nvme15n1
Disk /dev/nvme15n1: 465.76 GiB, 500107862016 bytes, 976773168 sectors
Disk model: Samsung SSD 980 PRO 500GB
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: gpt
Disk identifier: 3D6C5911-E7F6-42A3-80C8-106CE10A0603
Device             Start       End   Sectors   Size Type
/dev/nvme15n1p1       34      2047      2014  1007K BIOS boot
/dev/nvme15n1p2     2048   2099199   2097152     1G EFI System
/dev/nvme15n1p3  2099200 976773134 974673935 464.8G Solaris /usr & Apple ZFS
______________________________________________________________________________________________________________
What if I powered down the server, removed these two drives, installed them in another system, did a "zpool import", and then took a full backup of the pool using zfs send/recv?
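Roughly what I'm picturing for that step (the scratch pool "backup" and the snapshot name are just placeholders):
Code:
# on the other system, import the old root pool without mounting anything
# (if that box already has its own rpool, import this one by numeric GUID under another name)
zpool import -f -N -R /mnt/oldroot rpool
# snapshot everything recursively, then replicate it into a scratch pool
zfs snapshot -r rpool@migrate
zfs send -R rpool@migrate | zfs recv -F backup/rpool-copy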
Then I'd copy the BIOS Boot and EFI partitions to my smaller Optane drives (using dd or something like that), create a ZFS partition in the remaining free space on them, create a new pool named rpool, and restore my backup to these new, smaller drives using zfs send/recv.
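And roughly this for the rebuild/restore side (all device names are placeholders, and I haven't tested any of this):
Code:
# the old pool has to be exported first so the name "rpool" is free again
zpool export rpool
# recreate the same layout on the smaller Optane (repeat for the second one)
sgdisk -a1 -n1:34:2047  -t1:EF02 /dev/nvme1n1   # BIOS boot
sgdisk     -n2:2048:+1G -t2:EF00 /dev/nvme1n1   # EFI system partition
sgdisk     -n3:0:0      -t3:BF01 /dev/nvme1n1   # rest of the disk for ZFS
# raw-copy the BIOS Boot and EFI partitions from the surviving 980 Pro
dd if=/dev/nvme15n1p1 of=/dev/nvme1n1p1 bs=1M
dd if=/dev/nvme15n1p2 of=/dev/nvme1n1p2 bs=1M
# build the new, smaller mirror (altroot so nothing mounts over the helper system)
zpool create -f -o ashift=12 -R /mnt/newroot rpool mirror /dev/nvme1n1p3 /dev/nvme2n1p3
# restore the backup into it
zfs send -R backup/rpool-copy@migrate | zfs recv -F rpool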
Would this work?
I would certainly have disks with the same BIOS Boot and EFI system partitions.
I'd also have a new rpool with the same content as before (but smaller).
I'm guessing there is likely more to it than this though.
How does proxmox-boot-tool identify the disk to boot from? Does it go by pool GUID?
I imagine I'd at least need to make sure that both the individual disks and the new rpool have the same GUIDs as the old ones, so everything matches what proxmox-boot-tool expects to see?
Could I use "zfs set guid=" to set the guid of the new rpool to the same value as the old rpool and have it properly boot from it?
______________________________________________________________________________________________________________
Am I crazy for even considering this, or should I just do a clean install of Proxmox onto the new drives?
After doing so, and having the installer properly configure the new drives, maybe I could then zfs send/recv the content of the old rpool into the new rpool? That way the boot drive data would end up identical and I wouldn't have to do any migration/restore.
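Something along these lines after the fresh install, from a live/rescue environment (pool names, mount points, and the old pool's GUID are placeholders):
Code:
# import the old pool by its numeric GUID under a temporary name, then the fresh one
zpool import -f -N -R /mnt/old <guid-of-old-pool> rpool-old
zpool import -f -N -R /mnt/new rpool
zfs snapshot -r rpool-old@migrate
# overwrite the fresh install's root dataset tree with the old contents
zfs send -R rpool-old/ROOT@migrate | zfs recv -F rpool/ROOT
# (repeat for rpool-old/data etc. if anything else lives on the boot pool)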
______________________________________________________________________________________________________________
Maybe the easiest solution would just be to buy a couple of enterprise grade 500GB+ NVMe drives and swap them in using zpool replace and be done with it?
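That would just be the usual Proxmox boot-mirror replacement procedure, roughly this per replaced drive (device names are placeholders):
Code:
# copy the partition table from the healthy disk to the new one, then randomize its GUIDs
sgdisk /dev/nvme15n1 -R /dev/nvme16n1
sgdisk -G /dev/nvme16n1
# resilver onto the new disk's ZFS partition
# (the old partition name is whatever the failed disk shows up as in zpool status)
zpool replace rpool <failed-disk>-part3 /dev/nvme16n1p3
# set up the new ESP so proxmox-boot-tool keeps it in sync
proxmox-boot-tool format /dev/nvme16n1p2
proxmox-boot-tool init /dev/nvme16n1p2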
I'd appreciate any suggestions.
Hindsight being 20/20, I kind of wish I had created the boot pool using only a small portion of the 500GB drives' space, to make this kind of replacement easier.
I use these two drives just to boot and for the Debian/Proxmox system, no VM storage. The install only uses ~12GB on the drive... Lesson learned for next time I guess.