[SOLVED] Help me fix my rpool - need to remove a partition from the pool and need to make another one bootable

jsalas424

Hi all,
While working through another problem, I realized that I needed to fix the partition layout in my rpool before proceeding.

/dev/sda3 and /dev/sdb3 are now mirrored in rpool (correct), but sda2 should be an EFI boot partition and is instead part of the pool. I can't remove sda2 from the rpool, which I need to do before I can fix it with the following commands per the PVE manual:

Code:
# sgdisk works on whole disks, not partitions
sgdisk /dev/sdb -R /dev/sda
sgdisk -G /dev/sda


When I try to remove sda2, I get 'cannot offline /dev/sda2: no valid replicas'. What now?
Code:
root@Server:~# zpool status rpool
pool: rpool
state: ONLINE
status: Some supported and requested features are not enabled on the pool.
The pool can still be used, but some features are unavailable.
action: Enable all features using 'zpool upgrade'. Once this is done,
the pool may no longer be accessible by software that does not support
the features. See zpool-features(7) for details.
scan: resilvered 13.3G in 00:08:35 with 0 errors on Wed Aug 31 18:45:48 2022
config:

NAME        STATE     READ WRITE CKSUM
rpool       ONLINE       0     0     0
  mirror-0  ONLINE       0     0     0
    sda3    ONLINE       0     0     0
    sdb3    ONLINE       0     0     0
  sda2      ONLINE       0     0     0

errors: No known data errors

root@Server:~# zpool offline rpool /dev/sda2
cannot offline /dev/sda2: no valid replicas

root@Server:~# lsblk -f
NAME     FSTYPE     FSVER LABEL             UUID                                 FSAVAIL FSUSE% MOUNTPOINT
sda                                                                                         
├─sda1                                                                                       
├─sda2   zfs_member 5000  rpool             13990488175257004118                             
└─sda3   zfs_member 5000  rpool             13990488175257004118                             
sdb                                                                                         
├─sdb1                                                                                       
├─sdb2   vfat       FAT32                   EC8D-8512                                       
└─sdb3   zfs_member 5000  rpool             13990488175257004118

 
In older ZFS versions it was impossible to remove part of a stripe, but if your version is up to date you should be able to run zpool remove rpool /dev/sda2.

EDIT: And you can set up the ESP with proxmox-boot-tool.
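
A minimal sketch of that removal, assuming the pool is rpool and the stray top-level device is /dev/sda2 as in the status output above. Removing a top-level data device depends on the device_removal pool feature, and the status output mentions that some features are not enabled, so it may be worth checking that first. The helper name feature_allows_removal is hypothetical, and the zpool commands are left as comments so that pasting the block changes nothing.

```shell
# Hypothetical helper: succeeds when the value reported for
# feature@device_removal is "enabled" or "active".
feature_allows_removal() {
    case "$1" in
        enabled|active) return 0 ;;
        *)              return 1 ;;
    esac
}

# Intended use on the affected host (not run here):
#   state=$(zpool get -H -o value feature@device_removal rpool)
#   if feature_allows_removal "$state"; then
#       zpool remove rpool /dev/sda2   # moves its data onto mirror-0
#       zpool status rpool             # shows the removal progress
#   else
#       echo "device_removal is '$state'; 'zpool upgrade rpool' may be needed"
#   fi
```

Once the removal has finished, sda2 should no longer appear in zpool status and can be handed to proxmox-boot-tool.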
 
@leesteken thanks, I was able to use zpool remove. I ran into another error once I went to use proxmox-boot-tool:

Code:
root@Server:~# proxmox-boot-tool format /dev/sda2 --force
UUID="13990488175257004118" SIZE="536870912" FSTYPE="zfs_member" PARTTYPE="c12a7328-f81f-11d2-ba4b-00a0c93ec93b" PKNAME="sda" MOUNTPOINT=""
Formatting '/dev/sda2' as vfat..
mkfs.fat 4.2 (2021-01-31)
Done.

root@Server:~# proxmox-boot-tool init /dev/sda2
Re-executing '/usr/sbin/proxmox-boot-tool' in new private mount namespace..
UUID="2186-3DF6" SIZE="536870912" FSTYPE="vfat" PARTTYPE="c12a7328-f81f-11d2-ba4b-00a0c93ec93b" PKNAME="sda" MOUNTPOINT=""
Mounting '/dev/sda2' on '/var/tmp/espmounts/2186-3DF6'.
Installing grub i386-pc target..
Installing for i386-pc platform.
Installation finished. No error reported.
Unmounting '/dev/sda2'.
Adding '/dev/sda2' to list of synced ESPs..
Refreshing kernels and initrds..
Running hook script 'proxmox-auto-removal'..
Running hook script 'zz-proxmox-boot'..
No /etc/kernel/cmdline found - falling back to /proc/cmdline
Copying and configuring kernels on /dev/disk/by-uuid/2186-3DF6
        Copying kernel 5.13.19-6-pve
        Copying kernel 5.15.39-1-pve
        Copying kernel 5.15.39-3-pve
        Copying kernel 5.15.39-4-pve
Generating grub configuration file ...
Found linux image: /boot/vmlinuz-5.15.39-4-pve
Found initrd image: /boot/initrd.img-5.15.39-4-pve
Found linux image: /boot/vmlinuz-5.15.39-3-pve
Found initrd image: /boot/initrd.img-5.15.39-3-pve
Found linux image: /boot/vmlinuz-5.15.39-1-pve
Found initrd image: /boot/initrd.img-5.15.39-1-pve
Found linux image: /boot/vmlinuz-5.13.19-6-pve
Found initrd image: /boot/initrd.img-5.13.19-6-pve
done
Copying and configuring kernels on /dev/disk/by-uuid/EC8D-8512
        Copying kernel 5.13.19-6-pve
        Copying kernel 5.15.39-1-pve
        Copying kernel 5.15.39-3-pve
cp: error writing '/var/tmp/espmounts/EC8D-8512/initrd.img-5.15.39-3-pve': No space le

Here is what those disks look like

Code:
root@Server:~# lsblk -f
NAME     FSTYPE     FSVER LABEL             UUID                                 FSAVAIL FSUSE% MOUNTPOINT
sda                                                                                             
├─sda1                                                                                         
├─sda2   vfat       FAT32                   2186-3DF6                                           
└─sda3   zfs_member 5000  rpool             13990488175257004118                               
sdb                                                                                             
├─sdb1                                                                                         
├─sdb2   vfat       FAT32                   EC8D-8512                                           
└─sdb3   zfs_member 5000  rpool             13990488175257004118   

root@Server:~# lsblk
NAME     MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
sda        8:0    0 232.9G  0 disk
├─sda1     8:1    0  1007K  0 part
├─sda2     8:2    0   512M  0 part
└─sda3     8:3    0 232.4G  0 part
sdb        8:16   0 232.9G  0 disk
├─sdb1     8:17   0  1007K  0 part
├─sdb2     8:18   0   512M  0 part
└─sdb3     8:19   0 232.4G  0 part

Thoughts?
 
There is not enough space on /dev/sdb2 for all the kernels. Maybe uninstall some of the older kernels? I don't know why proxmox-boot-tool wants to copy so many kernels; usually it's only two or three.
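
To get a feel for how close the kernels come to the limit, one option is to add up the kernel and initrd images and compare the total with the ESP size (512M, or 536870912 bytes in the proxmox-boot-tool output above). The sketch below assumes the images sit in /boot under the vmlinuz-* and initrd.img-* names shown in the log. boot_files_bytes is a hypothetical helper, and its total is only a rough guide: it counts every image in the directory and ignores the bootloader files that also land on the ESP.

```shell
# Hypothetical helper: print the combined size in bytes of all
# vmlinuz-* and initrd.img-* files in the given directory.
boot_files_bytes() {
    dir=$1
    total=0
    for f in "$dir"/vmlinuz-* "$dir"/initrd.img-*; do
        [ -f "$f" ] || continue      # skip glob patterns that matched nothing
        size=$(wc -c < "$f")
        total=$((total + size))
    done
    echo "$total"
}

# Intended use on the host (not run here):
#   boot_files_bytes /boot
```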
 
Yeah, that makes sense, although when I went to remove the kernel I got a dpkg error.

First I rebooted the entire server, then I reran proxmox-boot-tool to see if anything had changed: no change. So I tried to remove the unused kernel, but nope :(

Code:
root@Server:~#  proxmox-boot-tool refresh
Running hook script 'proxmox-auto-removal'..
Running hook script 'zz-proxmox-boot'..
Re-executing '/etc/kernel/postinst.d/zz-proxmox-boot' in new private mount namespace..
No /etc/kernel/cmdline found - falling back to /proc/cmdline
Copying and configuring kernels on /dev/disk/by-uuid/2186-3DF6
        Copying kernel 5.13.19-6-pve
        Copying kernel 5.15.39-3-pve
        Copying kernel 5.15.39-4-pve
        Removing old version 5.15.39-1-pve
Generating grub configuration file ...
Found linux image: /boot/vmlinuz-5.15.39-4-pve
Found initrd image: /boot/initrd.img-5.15.39-4-pve
Found linux image: /boot/vmlinuz-5.15.39-3-pve
Found initrd image: /boot/initrd.img-5.15.39-3-pve
Found linux image: /boot/vmlinuz-5.13.19-6-pve
Found initrd image: /boot/initrd.img-5.13.19-6-pve
done
Copying and configuring kernels on /dev/disk/by-uuid/EC8D-8512
        Copying kernel 5.13.19-6-pve
        Copying kernel 5.15.39-3-pve
cp: error writing '/var/tmp/espmounts/EC8D-8512/initrd.img-5.15.39-3-pve': No space left on device


root@Server:~# apt purge pve-kernel-5.13.19-6-pve
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
The following package was automatically installed and is no longer required:
  pve-kernel-5.15.39-1-pve
Use 'apt autoremove' to remove it.
The following packages will be REMOVED:
  pve-kernel-5.13* pve-kernel-5.13.19-3-pve pve-kernel-5.13.19-6-pve*
0 upgraded, 0 newly installed, 3 to remove and 0 not upgraded.
5 not fully installed or removed.
After this operation, 656 MB disk space will be freed.
Do you want to continue? [Y/n] y
(Reading database ... 142246 files and directories currently installed.)
Removing pve-kernel-5.13 (7.1-9) ...
Removing pve-kernel-5.13.19-3-pve (5.13.19-7) ...
Examining /etc/kernel/postrm.d.
run-parts: executing /etc/kernel/postrm.d/initramfs-tools 5.13.19-3-pve /boot/vmlinuz-5.13.19-3-pve
update-initramfs: Deleting /boot/initrd.img-5.13.19-3-pve
run-parts: executing /etc/kernel/postrm.d/proxmox-auto-removal 5.13.19-3-pve /boot/vmlinuz-5.13.19-3-pve
run-parts: executing /etc/kernel/postrm.d/zz-proxmox-boot 5.13.19-3-pve /boot/vmlinuz-5.13.19-3-pve
Re-executing '/etc/kernel/postrm.d/zz-proxmox-boot' in new private mount namespace..
No /etc/kernel/cmdline found - falling back to /proc/cmdline
Copying and configuring kernels on /dev/disk/by-uuid/2186-3DF6
        Copying kernel 5.15.39-3-pve
        Copying kernel 5.15.39-4-pve
        Copying kernel 5.4.157-1-pve
        Removing old version 5.13.19-6-pve
Generating grub configuration file ...
Found linux image: /boot/vmlinuz-5.15.39-4-pve
Found initrd image: /boot/initrd.img-5.15.39-4-pve
Found linux image: /boot/vmlinuz-5.15.39-3-pve
Found initrd image: /boot/initrd.img-5.15.39-3-pve
Found linux image: /boot/vmlinuz-5.4.157-1-pve
Found initrd image: /boot/initrd.img-5.4.157-1-pve
done
Copying and configuring kernels on /dev/disk/by-uuid/EC8D-8512
        Copying kernel 5.15.39-3-pve
cp: error writing '/var/tmp/espmounts/EC8D-8512/initrd.img-5.15.39-3-pve': No space left on device
run-parts: /etc/kernel/postrm.d/zz-proxmox-boot exited with return code 1
Failed to process /etc/kernel/postrm.d at /var/lib/dpkg/info/pve-kernel-5.13.19-3-pve.postrm line 14.
dpkg: error processing package pve-kernel-5.13.19-3-pve (--remove):
 installed pve-kernel-5.13.19-3-pve package post-removal script subprocess returned error exit status 1
dpkg: too many errors, stopping
Errors were encountered while processing:
 pve-kernel-5.13.19-3-pve
Processing was halted because there were too many errors.
E: Sub-process /usr/bin/dpkg returned an error code (1)
 
Proxmox/apt automatically runs proxmox-boot-tool after adding or removing kernels, so it's not too surprising. Copying the kernels to /dev/sda2 worked fine. Maybe reformatting/initializing /dev/sdb2 with proxmox-boot-tool works better (and remove the old EC8D-8512 from /etc/kernel/proxmox-boot-uuids afterwards).
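
A sketch of that cleanup, assuming /etc/kernel/proxmox-boot-uuids holds one ESP UUID per line. drop_uuid is a hypothetical helper that rewrites the list without the stale entry. The proxmox-boot-tool calls are the same ones used earlier in the thread and appear as comments only.

```shell
# Hypothetical helper: remove one UUID line from a list file, in place.
drop_uuid() {
    file=$1
    uuid=$2
    tmp=$(mktemp) || return 1
    # -x matches whole lines, -F treats the UUID as a fixed string.
    # grep exits 1 when no lines are left over, which is fine here.
    grep -vxF -- "$uuid" "$file" > "$tmp" || [ $? -eq 1 ] || return 1
    cat "$tmp" > "$file"     # overwrite contents, keep owner and mode
    rm -f "$tmp"
}

# Intended sequence on the host (not run here):
#   proxmox-boot-tool format /dev/sdb2 --force
#   proxmox-boot-tool init /dev/sdb2
#   drop_uuid /etc/kernel/proxmox-boot-uuids EC8D-8512
#   proxmox-boot-tool refresh
```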
 
This no longer matters. /dev/sdb is the dying disk that needs to be replaced. I was able to boot from /dev/sda alone, so problem solved! Thanks for the help.
 
@leesteken, hopefully last question. I can boot from /dev/sda2 fine, but the gdisk info looks weird. I don't want to clone to the new drive without making sure the partition is okay to begin with.

Code:
root@Server:~# gdisk /dev/sda2
GPT fdisk (gdisk) version 1.0.6

Warning: Partition table header claims that the size of partition table
entries is 0 bytes, but this program  supports only 128-byte entries.
Adjusting accordingly, but partition table may be garbage.
Caution: invalid main GPT header, but valid backup; regenerating main header
from backup!

Caution! After loading partitions, the CRC doesn't check out!
Warning: Invalid CRC on main header data; loaded backup partition table.
Warning! Main and backup partition tables differ! Use the 'c' and 'e' options
on the recovery & transformation menu to examine the two tables.

Warning! One or more CRCs don't match. You should repair the disk!
Main header: ERROR
Backup header: OK
Main partition table: ERROR
Backup partition table: ERROR

Partition table scan:
  MBR: MBR only
  BSD: not present
  APM: not present
  GPT: damaged

Found valid MBR and corrupt GPT. Which do you want to use? (Using the
GPT MAY permit recovery of GPT data.)
 1 - MBR
 2 - GPT
 3 - Create blank GPT

Output of proxmox-boot-tool status:

Code:
root@Server:~# proxmox-boot-tool  status
Re-executing '/usr/sbin/proxmox-boot-tool' in new private mount namespace..
System currently booted with legacy bios
2186-3DF6 is configured with: grub (versions: 5.15.39-3-pve, 5.15.39-4-pve, 5.4.157-1-pve)
EC8D-8512 is configured with: uefi (versions: 5.3.18-3-pve, 5.4.103-1-pve, 5.4.106-1-pve), grub (versions: 5.13.19-6-pve, 5.15.35-2-pve, 5.15.35-3-pve, 5.15.39-1-pve, 5.15.39-3-pve)

Should I proceed to repair the partition table with the recovery & transformation menu? Since this is legacy BIOS, I'm guessing it would be to keep the MBR table only and copy the CRCs?
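
One thing that may explain the warnings: the gdisk command above is pointed at /dev/sda2, which is a partition, while gdisk reads partition tables from whole-disk devices such as /dev/sda. A freshly formatted vfat partition begins with a FAT boot sector rather than a GPT header, so output like this would not be surprising. The sketch below checks the lsblk TYPE before inspecting a device. is_whole_disk is a hypothetical helper.

```shell
# Hypothetical helper: succeeds only for the lsblk TYPE value "disk".
is_whole_disk() {
    [ "$1" = "disk" ]
}

# Intended use on the host (not run here):
#   type=$(lsblk -dno TYPE /dev/sda)
#   if is_whole_disk "$type"; then
#       gdisk -l /dev/sda        # lists the table without changing it
#   else
#       echo "/dev/sda is a '$type', not a whole disk"
#   fi
```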
 
Looks like the partition table was made with fdisk (which creates an MBR by default) instead of gdisk (which creates a GPT). Maybe make a GPT on the new disk?
EDIT: Can gdisk convert an MBR to a GPT?
 
So would the flow be:
1) Repair the MBR table/CRCs on the healthy disk
2) Copy to the new disk
3) Convert the new disk to GPT with gdisk?
4) Confirm that it can boot, then repeat to get GPT on the old healthy disk?

Can I use GPT with legacy BIOS booting?

For context, the dying disk was created with the PVE 6.x ISO installer.

From the man page first paragraph:
"GPT fdisk (aka gdisk) is a text-mode menu-driven program for creation and manipulation of partition tables. It will automatically convert an old-style Master Boot Record (MBR) partition table or BSD disklabel stored without an MBR carrier partition to the newer Globally Unique Identifier (GUID) Partition Table (GPT) format, or will load a GUID partition table ... Upon exiting with the 'w' option, gdisk replaces the MBR or disklabel with a GPT"

So the answer to "Can gdisk convert an MBR to a GPT?" is yes?



Yes, gdisk can convert MBR to GPT. It can also go the other way; see this excerpt from the gdisk man page for the g option:
https://linux.die.net/man/8/gdisk

'Convert GPT into MBR and exit. This option converts as many partitions as possible into MBR form, destroys the GPT data structures, saves the new MBR, and exits. Use this option if you've tried GPT and find that MBR works better for you. Note that this function generates up to four primary MBR partitions or three primary partitions and as many logical partitions as can be generated. Each logical partition requires at least one unallocated block immediately before its first block. Therefore, it may be possible to convert a maximum of four partitions on disks with tightly-packed partitions; however, if free space was inserted between partitions when they were created, and if the disk is under 2 TiB in size, it should be possible to convert all the partitions to MBR form. See also the 'h' option.'
 
I guess MBR works for your system and I don't think there is an error with your MBR. It's just gdisk that wants a GPT. Maybe best not to make changes to your known-good, working drive. Just manually create the partitions (right size, right type) with gdisk (with a GPT) on the new drive and see if it works.
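
A rough sketch of laying out the new drive by hand, mirroring the sizes in the lsblk output above (1007K, 512M, remainder). It uses sgdisk, the scriptable sibling of gdisk, and assumes the new drive shows up as /dev/sdX, which is a placeholder. The type codes EF02 (BIOS boot), EF00 (EFI system) and BF01 (ZFS) are my assumption about what the installer layout uses, and the sector numbers assume 512-byte logical sectors. The small BIOS boot partition is what allows GRUB to boot a GPT disk in legacy BIOS mode. print_layout_plan is a hypothetical helper that only prints the commands for review.

```shell
# Hypothetical helper: print (do not run) the commands for a new disk.
print_layout_plan() {
    disk=$1    # whole-disk device, e.g. the placeholder /dev/sdX
    # -a1 keeps sgdisk from realigning the tiny BIOS boot partition.
    echo "sgdisk -a1 -n1:34:2047 -t1:EF02 -n2:2048:+512M -t2:EF00 -n3:0:0 -t3:BF01 $disk"
    echo "sgdisk -p $disk"
    echo "proxmox-boot-tool format ${disk}2"
    echo "proxmox-boot-tool init ${disk}2"
}

print_layout_plan /dev/sdX
```

After reviewing and running the printed commands, the third partition could take the place of the failing one in the pool, for example with zpool replace.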
 