ZFS Boot Drive Resize in Proxmox

nave

New Member
Nov 23, 2023
Hello!

I had set up Proxmox on a 250 GB SSD to try it out, and I eventually cloned that drive to a larger 1 TB NVMe drive using the "dd" command. Since I was using ZFS, I could not grow the partition into the remaining free space on the NVMe drive with GParted, so I used the following command after booting Proxmox from the NVMe:

Code:
parted /dev/nvme0n1 resizepart 3 99%
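
For completeness, the clone itself was done with something along these lines (device names are from memory, so anyone copying this should check them with lsblk first):

Code:
# WARNING: dd overwrites the target disk wholesale; sdX and nvme0n1 are placeholders
dd if=/dev/sdX of=/dev/nvme0n1 bs=1M status=progress conv=fsync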

Everything in the terminal looked as it should, but I noticed that inside the Proxmox GUI it still showed the original SSD's amount of free space. I thought it would not be a problem.

Last night I was setting up a ZFS snapshot inside a VM (whose QEMU disk lives on the NVMe), and all of a sudden everything crashed. I thought the VM would continue the snapshot in the background, but this morning I cannot get into the Proxmox GUI. Proxmox thinks the drive is full, with no space left.

Since I have lost access to the GUI, all I have is terminal access.

Any idea how to increase the boot drive size that Proxmox sees as full? I did a search, but most of what I found relates to increasing VM disk sizes, so I have not been able to figure this out. Also, would booting a freshly written Proxmox installation USB give me any options without losing data?

Please keep in mind that I have lost remote access to Proxmox since pfSense was running inside Proxmox, and the terminal I have access to is directly on the bare metal, so copying and pasting might be difficult.

thanks for the help!
 
Have you tried this?
I just tried.

I was able to execute the first command - zpool set autoexpand=on <name of pool>.

However, I could not export the pool with either "zpool export rpool" or "zpool export -f rpool" in order to re-import it and trigger the resize; both fail with "pool or dataset is busy".
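
To recap, these are roughly the commands I tried (typed by hand at the console, so they may not be letter-perfect):

Code:
zpool set autoexpand=on rpool
zpool export rpool       # fails with "pool or dataset is busy"
zpool export -f rpool    # same error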

I then followed the last post:

Code:
zpool online -e rpool nvme0n1p3
zpool online -e ata-ST6000NM021A-2R7101_WRG06H4D

Something happened, but I am not sure what.

Now, when I reboot, everything freezes within a minute.

I was able to connect an old router and SSH in, but it freezes very quickly. I was able to pull the following:

Code:
root@pve:~# zpool status
  pool: rpool
 state: SUSPENDED
status: One or more devices are faulted in response to IO failures.
action: Make sure the affected devices are connected, then run 'zpool clear'.
   see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-HC
  scan: scrub repaired 0B in 00:05:34 with 0 errors on Sun Nov 12 00:29:36 2023
config:

    NAME         STATE     READ WRITE CKSUM
    rpool        ONLINE       0     0     0
      nvme0n1p3  ONLINE       3     0     0
errors: List of errors unavailable: pool I/O is currently suspended

errors: 22 data errors, use '-v' for a list
root@pve:~# zpool list -v
NAME          SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ   FRAG    CAP  DEDUP    HEALTH  ALTROOT
rpool         920G   215G   705G        -         -    13%    23%  1.00x  SUSPENDED  -
  nvme0n1p3   921G   215G   705G        -         -    13%  23.4%      -    ONLINE
 
Hmm ... that does not look so good.

Try running zpool status -v next time so we can see what errors there are.
Code:
root@pve:~# zpool status -v
  pool: rpool
 state: SUSPENDED
status: One or more devices are faulted in response to IO failures.
action: Make sure the affected devices are connected, then run 'zpool clear'.
   see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-HC
  scan: scrub repaired 0B in 00:05:34 with 0 errors on Sun Nov 12 00:29:36 2023
config:

    NAME         STATE     READ WRITE CKSUM
    rpool        UNAVAIL      0     0     0  insufficient replicas
      nvme0n1p3  UNAVAIL      0     0     0

errors: List of errors unavailable: pool I/O is currently suspended
 
Hmm ... now all devices are unavailable. Can you post the output of dmesg?
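
If copying from the console is awkward, you could dump it to a file and pull it over SSH while the box is still responsive; something along these lines should do (the host name is just a placeholder):

Code:
# on the PVE host: save the kernel log with human-readable timestamps
dmesg -T > /root/dmesg.txt
# from another machine: fetch the file
scp root@pve:/root/dmesg.txt .
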
Thanks for your help! It was pretty long, so it would not post; I have attached it as a txt file. Let me know if that works.

Thanks again!
 

Looks normal - no errors.

You may need to boot the most recent PVE install disk and choose rescue mode to get into your system again, try to import the pool there, and then inspect what is going on. The pool had I/O errors, so we need to find out what is corrupt and maybe fix it.

PS: Do you have backups of everything relevant?
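
If rescue mode does not get you in, roughly this should work from the installer's debug shell or any live environment with ZFS support (adjust the names if your layout differs):

Code:
zpool import                    # list pools the live environment can see
zpool import -f -R /mnt rpool   # force-import the pool under an alternate root
zfs list -r rpool               # check that the datasets are still there
zpool status -v rpool           # inspect the error state
zpool export rpool              # export cleanly before rebooting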
 
I have backups of some VMs from a couple of weeks back, so those are covered. I don't remember if I backed up the containers. However, I recently started migrating Docker apps from Unraid to a Proxmox VM; I have no backups of those, and I have since deleted them from Unraid.

Let me work on the install disk, and start there.

Thanks again!
 
I am getting the following error when booting the Proxmox installation USB and choosing Rescue Boot.

ERROR: unable to find disk automatically
 
Just got it working again! I am not sure why, but the I/O error was related to a PCI SATA controller. I had originally disconnected that controller because I could not get past the HP logo while booting. After re-reading the action line from zpool status -v ("Make sure the affected devices are connected, then run 'zpool clear'."), I decided to reconnect the PCI controller, and everything booted normally.

Code:
root@pve:~# zpool status
  pool: rpool
 state: ONLINE
  scan: scrub repaired 0B in 00:05:34 with 0 errors on Sun Nov 12 00:29:36 2023
config:


    NAME         STATE     READ WRITE CKSUM
    rpool        ONLINE       0     0     0
      nvme0n1p3  ONLINE       0     0     0


errors: No known data errors

I am also seeing the correct storage size in the GUI, so the first steps actually worked.
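
For anyone else who ends up here, the sequence that actually expanded the boot pool was roughly the following, plus reconnecting the SATA controller that was causing the I/O errors (reconstructed from memory, so verify the device and partition names on your own system first):

Code:
parted /dev/nvme0n1 resizepart 3 99%   # grow the ZFS partition into the free space
zpool set autoexpand=on rpool          # allow the pool to use the extra space
zpool online -e rpool nvme0n1p3        # trigger the expansion
zpool list -v                          # confirm the new SIZE/FREE values
# if the pool shows up SUSPENDED, reconnect any missing devices and run
# 'zpool clear rpool' as the status output suggests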

Thanks again for your time and help!
 