Zpools Not Importing After Power Failure

Nov 22, 2020
5
1
3
27
Boston, MA
Hello,

After a power outage, my server is in a state where none of the storage devices are being recognized. Initially, the server would not boot, which I believe was because the zpool cache file was corrupted. I then did the following to move the troublesome cache file out of the way:

Code:
mv /etc/zfs/zpool.cache /etc/zfs/zpool.cache.backup


Now my server boots, but none of my zpools are loading. I can't even run a simple command like:
Code:
zpool import
because it hangs indefinitely.

The output of dmesg (attached) shows zpool blocking in the kernel and hanging indefinitely with:
Code:
Code: Bad RIP value.


systemctl shows something worrisome when I check the status of "zfs-mount.service":
Code:
● zfs-mount.service - Mount ZFS filesystems
   Loaded: loaded (/lib/systemd/system/zfs-mount.service; enabled; vendor preset: enabled)
   Active: inactive (dead)
Condition: start condition failed at Sun 2020-11-22 01:22:38 EST; 17min ago
           └─ ConditionPathIsDirectory=/sys/module/zfs was not met
     Docs: man:zfs(8)
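
The unmet condition above means /sys/module/zfs did not exist when the unit was evaluated, which usually indicates the zfs kernel module was not loaded at that point. A minimal sketch of the same check follows; the function name and its optional path argument are made up for illustration, and it does not by itself explain why the module failed to load:

```shell
#!/bin/sh
# Report whether the ZFS kernel module is loaded, using the same path that
# zfs-mount.service tests in its ConditionPathIsDirectory= line.
# zfs_module_state is a hypothetical helper; the optional argument overrides
# the path and exists only so the check can be tried against other directories.
zfs_module_state() {
    module_dir="${1:-/sys/module/zfs}"
    if [ -d "$module_dir" ]; then
        echo "loaded"
    else
        echo "not loaded"
    fi
}

echo "zfs module: $(zfs_module_state)"
# If this reports "not loaded", running `modprobe zfs` as root tries to load
# the module, and `journalctl -b -k | grep -i -e zfs -e spl` shows related
# kernel messages from the current boot.
```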

Any ideas? I'd definitely put a beer money bounty on solving this one! I would love to recover some data in one of my VMs that I hadn't set a proper backup for.
 

Attachments

  • dmesg.txt
    148.6 KB · Views: 3
May 31, 2020
120
15
18
the Netherlands
VERIFY3(range_tree_space(smla->smla_rt) + sme->sme_run <= smla->smla_sm->sm_size) failed PANIC at space_map.c:383:space_map_load_callback()
Those messages appear to indicate that the pool's on-disk metadata is broken at a low level. I would not know how to repair that without expert knowledge of the ZFS implementation.
I hope I'm wrong and someone can help you recover from this error...
 

RobFantini

Renowned Member
May 24, 2012
1,836
58
68
Boston,Mass
Does the ZFS mount point have directories like dump, template, etc.?

If so, there is a ZFS option to fix this. I'll look for it, as I used it again a few weeks ago.
 
Nov 22, 2020
5
1
3
27
Boston, MA
I booted from a live CD and noticed something odd. This server has two pools, "tank" and "fast-tank". I am able to import "tank" with no problems from the live CD. However, when I try to import "fast-tank", I get the same behavior as before: the zpool command hangs indefinitely.

I guess this means the "fast-tank" zpool is in a corrupt state. I tried running "sudo zpool import -fFX fast-tank", but that hangs indefinitely as well. It also causes any later command, even one as simple as "zpool import", to hang indefinitely. I bet this is what happens when Proxmox boots. I'm not sure how to proceed from here...
 
May 31, 2020
120
15
18
the Netherlands
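Maybe try -F together with the -n option when importing the broken pool? With -n, zpool only reports whether the -F recovery would make the pool importable, without actually performing it. If all else fails, maybe -X. Check man zpool for more information.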
Note that I do not know enough about ZFS and/or your specific situation to determine whether those options will help or make problems worse!
Use your own judgement before trying, or ask an expert.
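
A cautious order for those import attempts could look like the sketch below. It only prints the commands unless DRY_RUN=0 is set. The run helper, the function names, and the /mnt/recovery alternate root are assumptions for illustration; the flags themselves (-F, -n, -N, -R, -o readonly=on) are documented in man zpool-import.

```shell
#!/bin/sh
# Print each command by default; set DRY_RUN=0 to actually execute it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

POOL="fast-tank"            # pool name from this thread
ALTROOT="/mnt/recovery"     # hypothetical alternate root for mounts

# Step 1: ask whether a rewind (-F) would make the pool importable.
# With -n nothing is changed on disk.
check_rewind() {
    run zpool import -F -n "$POOL"
}

# Step 2: import read-only, without mounting any datasets (-N), under an
# alternate root (-R). Add -f if the pool was last used by another system.
import_readonly() {
    run zpool import -o readonly=on -N -R "$ALTROOT" "$POOL"
}

check_rewind
import_readonly
```

Keeping -X for last matches the advice above, since it is the most aggressive recovery option.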
 
Nov 22, 2020
5
1
3
27
Boston, MA
There is a user over here: https://superuser.com/questions/155...c-if-not-imported-as-readonly/1604173#1604173 that seems to be reporting the exact same problem I am facing.

I tried other import options, but none seem to work. I was able to import the pool in read-only mode from another live CD, *but* there don't seem to be any files present when I run "ls /fast-tank". "ls /dev/fast-tank" does show entries, but since that is not the mounted pool, the files appear to be spread across the 2 drives, and I haven't yet tried to recover them by pointing at the drives manually (is that even possible?).
 
Nov 22, 2020
5
1
3
27
Boston, MA
Update here: I think I've made some progress. I am able to import both pools from my Ubuntu live CD. As stated before, my pool fast-tank is the troublesome one and can only be imported in read-only mode. My output of zpool status is as follows:

Code:
  pool: fast-tank
 state: ONLINE
  scan: scrub repaired 0B in 0 days 00:01:30 with 0 errors on Sun Nov  8 05:25:31 2020
config:

    NAME         STATE     READ WRITE CKSUM
    fast-tank    ONLINE       0     0     0
      mirror-0   ONLINE       0     0     0
        nvme0n1  ONLINE       0     0     0
        nvme1n1  ONLINE       0     0     0

errors: No known data errors

  pool: tank
 state: ONLINE
  scan: scrub repaired 0B in 0 days 00:00:39 with 0 errors on Sun Nov  8 05:24:41 2020
config:

    NAME                                             STATE     READ WRITE CKSUM
    tank                                             ONLINE       0     0     0
      raidz1-0                                       ONLINE       0     0     0
        ata-Samsung_SSD_870_QVO_4TB_S5VYNG0N700101R  ONLINE       0     0     0
        ata-Samsung_SSD_870_QVO_4TB_S5VYNG0N700705Z  ONLINE       0     0     0
        ata-Samsung_SSD_870_QVO_4TB_S5VYNG0N700683J  ONLINE       0     0     0

errors: No known data errors

So it seems there are no known errors as far as ZFS can tell. After importing fast-tank, the directory /fast-tank appears empty. However, I see the VMs listed in:

Code:
ubuntu@ubuntu:~$ ls /dev/fast-tank/
vm-100-disk-0        vm-101-disk-0        vm-103-disk-0
vm-100-disk-0-part1  vm-101-disk-0-part1  vm-103-disk-0-part1
vm-100-disk-0-part2  vm-101-disk-0-part5  vm-103-disk-0-part2
vm-100-disk-0-part3  vm-101-disk-0-part6  vm-103-disk-0-part3

These VMs appear to be saved as zvols and I don't understand how to mount them directly to back them up. I would love to recover these VMs. Any advice?
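
Zvols are block devices rather than datasets containing files, which is why /fast-tank looks empty while /dev/fast-tank lists the disks and their partitions. A sketch of how one might inspect them and mount a single partition read-only is below. It prints the commands unless DRY_RUN=0 is set; the run helper, the function names, the /mnt/vm-part mount point, and the choice of vm-100-disk-0-part2 are assumptions, since the thread does not say which partition holds which filesystem.

```shell
#!/bin/sh
# Print each command by default; set DRY_RUN=0 to actually execute it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

POOL="fast-tank"        # pool name from this thread
MNT="/mnt/vm-part"      # hypothetical mount point; create it first

# List the volumes (zvols) in the pool and their sizes.
list_zvols() {
    run zfs list -t volume -r "$POOL"
}

# Show which filesystem, if any, each partition of a zvol contains.
show_filesystems() {
    run lsblk -f "/dev/$POOL/$1"
}

# Mount one partition read-only. An ext4 filesystem with an unreplayed
# journal may need "-o ro,noload" because the device itself is read-only.
mount_partition_ro() {
    run mount -o ro "/dev/$POOL/$1" "$MNT"
}

list_zvols
show_filesystems vm-100-disk-0
mount_partition_ro vm-100-disk-0-part2
```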
 
May 31, 2020
120
15
18
the Netherlands
I would expect that you can copy the virtual disks to another zpool and start or back up the VMs from there.
Can you make sure fast-tank is not mounted automatically and then mount it read-only on Proxmox? Maybe you can back up the VMs that way without copying everything.
Can you do a scrub on the read-only pool to check for errors? If you (or the scrub itself) can fix the errors and mount the pool read-write, maybe you don't need to copy.
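
Since snapshots cannot be created on a pool imported read-only, one way to copy the virtual disks is a plain block-level copy of each zvol into an image file on the healthy pool. The sketch below prints the dd commands unless DRY_RUN=0 is set. The run helper, the backup_zvol function, and the /tank/vm-backup destination are assumptions; the zvol names come from the ls output earlier in the thread, and status=progress and conv=sparse are GNU dd options.

```shell
#!/bin/sh
# Print each command by default; set DRY_RUN=0 to actually execute it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

POOL="fast-tank"            # source pool, imported read-only
DEST="/tank/vm-backup"      # hypothetical directory on the healthy pool

# Copy one zvol into a raw image file. conv=sparse avoids writing out
# blocks that are entirely zero.
backup_zvol() {
    run dd if="/dev/$POOL/$1" of="$DEST/$1.raw" bs=1M conv=sparse status=progress
}

for vol in vm-100-disk-0 vm-101-disk-0 vm-103-disk-0; do
    backup_zvol "$vol"
done
```

The resulting .raw files are plain disk images, so they can be kept as a backup before attempting anything riskier on the damaged pool.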
 
