ZFS Boot Drive Resize in Proxmox

nave

New Member
Nov 23, 2023
Hello!

I had set up Proxmox on a 250 GB SSD to try it out, and I eventually cloned that drive to a larger 1 TB NVMe drive using the "dd" command. Since I was using ZFS, I could not grow the partition into the remaining free space on the NVMe drive with GParted, so I used the following command after booting Proxmox from the NVMe:

Code:
parted /dev/nvme0n1 resizepart 3 99%
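
For completeness, the clone itself was done with something along these lines (device names are from memory, so anyone copying this should check them with lsblk first):

Code:
# WARNING: dd overwrites the target disk wholesale; sdX and nvme0n1 are placeholders
dd if=/dev/sdX of=/dev/nvme0n1 bs=1M status=progress conv=fsync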

Everything in the terminal looked as it should, but I noticed that inside the Proxmox GUI it still showed the original SSD's amount of free space. I thought it would not be a problem.

Last night I was setting up a ZFS snapshot inside a VM (whose QEMU disk lives on the NVMe), and all of a sudden everything crashed. I thought the VM would continue the snapshot in the background, but this morning I cannot get into the Proxmox GUI. Proxmox thinks the drive is full, with no space left.

Since I have lost access to the GUI, all I have is terminal access.

Any idea how to increase the boot drive size that Proxmox sees as full? I did a search, but most of what I found relates to increasing VM disk sizes, so I have not been able to figure this out. Also, would booting a freshly written Proxmox installation USB give me any options without losing data?

Please keep in mind that I have lost remote access to Proxmox since pfSense was running inside Proxmox, and the terminal I have access to is directly on the bare metal, so copying and pasting might be difficult.

thanks for the help!
 
Have you tried this?
I just tried.

I was able to execute the first command - zpool set autoexpand=on <name of pool>.

However, I could not export the pool with either "zpool export rpool" or "zpool export -f rpool" in order to re-import it and trigger the resize; both fail with "pool or dataset is busy".
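
To recap, these are roughly the commands I tried (typed by hand at the console, so they may not be letter-perfect):

Code:
zpool set autoexpand=on rpool
zpool export rpool       # fails with "pool or dataset is busy"
zpool export -f rpool    # same error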

I then followed the last post:

Code:
zpool online -e rpool nvme0n1p3
zpool online -e ata-ST6000NM021A-2R7101_WRG06H4D

Something happened, but I am not sure what.

Now, when I reboot, everything freezes within a minute.

I was able to connect an old router and SSH in, but it freezes very quickly. I was able to pull the following:

Code:
root@pve:~# zpool status
  pool: rpool
 state: SUSPENDED
status: One or more devices are faulted in response to IO failures.
action: Make sure the affected devices are connected, then run 'zpool clear'.
   see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-HC
  scan: scrub repaired 0B in 00:05:34 with 0 errors on Sun Nov 12 00:29:36 2023
config:

    NAME         STATE     READ WRITE CKSUM
    rpool        ONLINE       0     0     0
      nvme0n1p3  ONLINE       3     0     0
errors: List of errors unavailable: pool I/O is currently suspended

errors: 22 data errors, use '-v' for a list
root@pve:~# zpool list -v
NAME          SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ   FRAG    CAP  DEDUP    HEALTH  ALTROOT
rpool         920G   215G   705G        -         -    13%    23%  1.00x  SUSPENDED  -
  nvme0n1p3   921G   215G   705G        -         -    13%  23.4%      -    ONLINE
 
Hmm ... that does not look so good.

Try running zpool status -v next time so we can see what errors there are.
Code:
root@pve:~# zpool status -v
  pool: rpool
 state: SUSPENDED
status: One or more devices are faulted in response to IO failures.
action: Make sure the affected devices are connected, then run 'zpool clear'.
   see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-HC
  scan: scrub repaired 0B in 00:05:34 with 0 errors on Sun Nov 12 00:29:36 2023
config:

    NAME         STATE     READ WRITE CKSUM
    rpool        UNAVAIL      0     0     0  insufficient replicas
      nvme0n1p3  UNAVAIL      0     0     0

errors: List of errors unavailable: pool I/O is currently suspended
 
Hmm ... now all devices are unavailable. Can you post the output of dmesg?
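
If copying from the console is awkward, you could dump it to a file and pull it over SSH while the box is still responsive; something along these lines should do (the host name is just a placeholder):

Code:
# on the PVE host: save the kernel log with human-readable timestamps
dmesg -T > /root/dmesg.txt
# from another machine: fetch the file
scp root@pve:/root/dmesg.txt .
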
Thanks for your help! It was pretty long, so it would not post; I have attached it as a txt file. Let me know if that works.

Thanks again!
 

Looks normal - no errors.

You may need to boot the most recent PVE install disk and choose rescue mode to get into your system again, try to import the pool there, and then inspect what is going on. The pool had I/O errors, so we need to find out what is corrupt and maybe fix it.

PS: Do you have backups of everything relevant?
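
If rescue mode does not get you in, roughly this should work from the installer's debug shell or any live environment with ZFS support (adjust the names if your layout differs):

Code:
zpool import                    # list pools the live environment can see
zpool import -f -R /mnt rpool   # force-import the pool under an alternate root
zfs list -r rpool               # check that the datasets are still there
zpool status -v rpool           # inspect the error state
zpool export rpool              # export cleanly before rebooting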
 
I have backups of some VMs from a couple of weeks back, so those are covered. I don't remember if I backed up the containers. However, I recently started migrating Docker apps from Unraid to a Proxmox VM; I have no backups of those, and I have since deleted them from Unraid.

Let me work on the install disk, and start there.

Thanks again!
 
I am getting the following error when booting the Proxmox installation USB and choosing Rescue Boot.

ERROR: unable to find disk automatically
 
Just got it working again! I am not sure why, but the I/O error was related to a PCI SATA controller. I had originally disconnected that controller because I could not get past the HP logo while booting. After re-reading the action line from zpool status -v ("Make sure the affected devices are connected, then run 'zpool clear'."), I decided to reconnect the PCI controller, and everything booted normally.

Code:
root@pve:~# zpool status
  pool: rpool
 state: ONLINE
  scan: scrub repaired 0B in 00:05:34 with 0 errors on Sun Nov 12 00:29:36 2023
config:


    NAME         STATE     READ WRITE CKSUM
    rpool        ONLINE       0     0     0
      nvme0n1p3  ONLINE       0     0     0


errors: No known data errors

I am also seeing the correct storage size in the GUI, so the first steps actually worked.
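
For anyone else who ends up here, the sequence that actually expanded the boot pool was roughly the following, plus reconnecting the SATA controller that was causing the I/O errors (reconstructed from memory, so verify the device and partition names on your own system first):

Code:
parted /dev/nvme0n1 resizepart 3 99%   # grow the ZFS partition into the free space
zpool set autoexpand=on rpool          # allow the pool to use the extra space
zpool online -e rpool nvme0n1p3        # trigger the expansion
zpool list -v                          # confirm the new SIZE/FREE values
# if the pool shows up SUSPENDED, reconnect any missing devices and run
# 'zpool clear rpool' as the status output suggests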

Thanks again for your time and help!
 