ZFS zpool error during check - cannot boot anymore

tebse

Hi Proxmox Support Forum,

I am an inexperienced user with limited Linux knowledge. I have been running PVE for about 2 1/2 years. I upgraded to 6.2 (?) some months ago, so this should be my current version.

Last night I received an error email:

Code:
The number of I/O errors associated with a ZFS device exceeded acceptable levels. ZFS has marked the device as faulted.
impact: Fault tolerance of the pool may be compromised.
...
vpath /dev/sdd1
...

sdd is an SSD used "only" for log purposes.

Stupid me: as usual when I expect an upgrade / repair / reboot, I started an apt upgrade of the node from the web frontend. This made the web frontend freeze, but SSH was still available. From there I ran

Code:
zpool status

and found sdd1 reported as broken. I tried

Code:
zpool remove rpool sdd1

which made the whole system unresponsive (frozen). After a reboot, I got this:

Foto 01.11.20, 09 37 10.jpg

I followed post 2 from @fabian from this thread:
https://forum.proxmox.com/threads/unable-to-boot-from-zfs-rpool-after-upgrading-to-pve-4-2.27222/

Code:
zpool import
zpool import -m rpool

but all I got was this:


Foto 01.11.20, 09 49 28.jpg
Foto 01.11.20, 09 49 21.jpg

I kindly ask for help.
Thank you
Thorsten
 

OK, some success:

Code:
zpool import -f

gave an error, but after a second reboot I was able to run:

Code:
zpool import -m rpool
exit

Back on SSH, I could remove the defective SSD with:
Code:
root@ebb-vn01:~# zpool status
  pool: rpool
 state: DEGRADED
status: One or more devices could not be used because the label is missing or
        invalid.  Sufficient replicas exist for the pool to continue
        functioning in a degraded state.
action: Replace the device using 'zpool replace'.
   see: http://zfsonlinux.org/msg/ZFS-8000-4J
  scan: scrub repaired 0B in 0 days 07:08:40 with 0 errors on Sun Oct 11 07:32:43 2020
config:

        NAME                   STATE     READ WRITE CKSUM
        rpool                  DEGRADED     0     0     0
          raidz1-0             ONLINE       0     0     0
            sda2               ONLINE       0     0     0
            sdb2               ONLINE       0     0     0
            sdc2               ONLINE       0     0     0
        logs
          1023037337071760466  UNAVAIL      0     0     0  was /dev/sdd1

errors: No known data errors
root@ebb-vn01:~# zpool remove rpool 1023037337071760466
root@ebb-vn01:~# zpool status
  pool: rpool
 state: ONLINE
status: Some supported features are not enabled on the pool. The pool can
        still be used, but some features are unavailable.
action: Enable all features using 'zpool upgrade'. Once this is done,
        the pool may no longer be accessible by software that does not support
        the features. See zpool-features(5) for details.
  scan: scrub repaired 0B in 0 days 07:08:40 with 0 errors on Sun Oct 11 07:32:43 2020
config:

        NAME        STATE     READ WRITE CKSUM
        rpool       ONLINE       0     0     0
          raidz1-0  ONLINE       0     0     0
            sda2    ONLINE       0     0     0
            sdb2    ONLINE       0     0     0
            sdc2    ONLINE       0     0     0

errors: No known data errors
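
For anyone repeating this: in the session above, the stale log device was removed by the numeric GUID that `zpool status` printed in place of `sdd1`. Here is a minimal sketch, assuming the table layout shown above, that pulls such identifiers out of the status output. `unhealthy_vdevs` is a hypothetical helper name, not part of ZFS.

```shell
#!/bin/sh
# Hypothetical helper (not a ZFS tool): read `zpool status` output on stdin
# and print the first column (device name or GUID) of every row in the
# config table whose STATE is UNAVAIL, FAULTED, REMOVED or OFFLINE.
unhealthy_vdevs() {
    awk '
        # The header row marks the start of the config table.
        $1 == "NAME" && $2 == "STATE" { in_config = 1; next }
        # The "errors:" line follows the table.
        $1 == "errors:"               { in_config = 0 }
        in_config && $2 ~ /^(UNAVAIL|FAULTED|REMOVED|OFFLINE)$/ { print $1 }
    '
}

# Intended use on the node (not executed here):
#   zpool status rpool | unhealthy_vdevs
```

The printed identifier is what was passed to `zpool remove rpool ...` in the session above.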
 
I guess I can now replace the SSD and reassign LOG / CACHE :-)

Next question: I used the SSD partition /dev/sdd2 for swap. Here is my fstab:
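
Once the new SSD is installed, re-adding the log (and, if wanted, a cache) device could look roughly like this. This is a sketch only: the `/dev/disk/by-id/EXAMPLE-...` paths are placeholders, not real device names, and the script only prints the commands so they can be reviewed before being run by hand.

```shell
#!/bin/sh
# Placeholder paths (assumptions): replace with the by-id names of the
# partitions on the new SSD. by-id names stay stable across reboots,
# unlike sdX names.
LOG_DEV="${LOG_DEV:-/dev/disk/by-id/EXAMPLE-SSD-part1}"
CACHE_DEV="${CACHE_DEV:-/dev/disk/by-id/EXAMPLE-SSD-part3}"

# Print the zpool commands for pool $1 instead of executing them.
plan_readd() {
    pool=$1
    echo "zpool add $pool log $LOG_DEV"
    echo "zpool add $pool cache $CACHE_DEV"
}

plan_readd rpool
```

After running the printed commands, `zpool status` should list the new devices under `logs` and `cache`.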

Code:
# <file system> <mount point> <type> <options> <dump> <pass>
/dev/sdd2 none swap sw 0 0
# /dev/zvol/rpool/swap none swap sw 0 0
proc /proc proc defaults 0 0

Now I would like to switch back to:

Code:
# <file system> <mount point> <type> <options> <dump> <pass>
# /dev/sdd2 none swap sw 0 0
/dev/zvol/rpool/swap none swap sw 0 0
proc /proc proc defaults 0 0

Please note the position of the "#"; this is meant as a stopgap until I have replaced the SSD. As
Code:
root@ebb-vn01:~# zpool status
  pool: rpool
 state: ONLINE
status: Some supported features are not enabled on the pool. The pool can
        still be used, but some features are unavailable.
action: Enable all features using 'zpool upgrade'. Once this is done,
        the pool may no longer be accessible by software that does not support
        the features. See zpool-features(5) for details.
  scan: scrub repaired 0B in 0 days 07:08:40 with 0 errors on Sun Oct 11 07:32:43 2020
config:

        NAME        STATE     READ WRITE CKSUM
        rpool       ONLINE       0     0     0
          raidz1-0  ONLINE       0     0     0
            sda2    ONLINE       0     0     0
            sdb2    ONLINE       0     0     0
            sdc2    ONLINE       0     0     0

errors: No known data errors

did not show a swap, how can I temporarily create and enable a swap?

Thank you and best regards
Thorsten
 

Replace 8G with the desired swap size:

Code:
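# Only if an old swap zvol exists and should be recreated:
#zfs destroy rpool/swap

# Create the swap zvol with a block size matching the memory page size
zfs create -V 8G -b $(getconf PAGESIZE) -o compression=zle \
  -o logbias=throughput -o sync=always \
  -o primarycache=metadata -o secondarycache=none \
  -o com.sun:auto-snapshot=false rpool/swap

# Format the zvol as swap
mkswap -f /dev/zvol/rpool/swap

# Register it in /etc/fstab (skip this if the entry is already there)
cat << 'EOF' >> /etc/fstab
/dev/zvol/rpool/swap none swap defaults 0 0
EOF

# Enable all swap entries from /etc/fstab, verbosely
swapon -av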
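
To confirm afterwards that the swap is really in use, something like the following could help. It is a sketch under assumptions: `swap_is_active` is a made-up helper name, and it expects a table in the format of `/proc/swaps`. Note that `/dev/zvol/rpool/swap` is a symlink, and the kernel typically lists the resolved node (for example `/dev/zd0`), hence the `readlink -f` in the usage comment.

```shell
#!/bin/sh
# Hypothetical helper: succeed (exit 0) if device $1 is listed in the swaps
# table $2 (default: /proc/swaps), fail otherwise.
swap_is_active() {
    dev=$1
    table=${2:-/proc/swaps}
    # Skip the header line, then compare the first column exactly.
    awk -v dev="$dev" 'NR > 1 && $1 == dev { found = 1 } END { exit !found }' "$table"
}

# Intended use on the node (not executed here):
#   swap_is_active "$(readlink -f /dev/zvol/rpool/swap)" && echo "zvol swap is active"
```

`swapon --show` gives the same information in a more readable form.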
 
