[SOLVED] Second ZFS pool failed to import on boot

mstefan

Member
Jan 3, 2022
Hello,

I have 2 ZFS pools on my machine (PVE 7.1):
  • rpool (Mirror SSD)
  • datapool (Mirror HDD)
Every time I boot up my machine, I get an error that the import of datapool failed.
A look at the syslog shows the same entry every time:
Code:
Jan  3 13:31:29 pve systemd[1]: Starting Import ZFS pool datapool...
Jan  3 13:31:29 pve zpool[1642]: cannot import 'datapool': no such pool available
Jan  3 13:31:29 pve systemd[1]: zfs-import@datapool.service: Main process exited, code=exited, status=1/FAILURE
Jan  3 13:31:29 pve systemd[1]: zfs-import@datapool.service: Failed with result 'exit-code'.
Jan  3 13:31:29 pve systemd[1]: Failed to start Import ZFS pool datapool.
Jan  3 13:31:29 pve zed: eid=7 class=config_sync pool='datapool'
Jan  3 13:31:29 pve zed: eid=8 class=pool_import pool='datapool'
Jan  3 13:31:29 pve zed: eid=10 class=config_sync pool='datapool'

I already tried:
  • adding rootdelay=10 to /etc/default/grub
  • adding rootdelay=10 to /etc/kernel/cmdline
  • adding ZFS_INITRD_PRE_MOUNTROOT_SLEEP='5' and ZFS_INITRD_POST_MODPROBE_SLEEP='5' to /etc/default/zfs
and refreshed the boot configuration with pve-efiboot-tool refresh (roughly as sketched below).
None of these helped.
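For reference, the entries I added looked roughly like this (the existing values on your system will differ; the root pool line is just the stock example from my install):
Code:
# /etc/default/grub (when booting via GRUB) -- append to the existing line, then run update-grub:
GRUB_CMDLINE_LINUX_DEFAULT="quiet rootdelay=10"

# /etc/kernel/cmdline (when booting via systemd-boot) -- one single line, rootdelay appended:
root=ZFS=rpool/ROOT/pve-1 boot=zfs rootdelay=10

# /etc/default/zfs -- initramfs sleep options (need update-initramfs -k all -u to take effect):
ZFS_INITRD_PRE_MOUNTROOT_SLEEP='5'
ZFS_INITRD_POST_MODPROBE_SLEEP='5'

# afterwards refresh the kernels/initrds on the ESP:
pve-efiboot-tool refresh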

When PVE is up and running, the pool is online and everything is fine.

Is this something I have to worry about?
How can I fix this error message?

Thanks for your help!
 
hi,

can you check the following:
Code:
systemctl status zfs-import.service zfs-import-cache.service

if the zfs-import-cache service is enabled, then maybe your zfs cachefile is corrupt. you could try:
Code:
zpool set cachefile=/etc/zfs/zpool.cache <YOURPOOL> # do this for all your pools
update-initramfs -k all -u
reboot

or alternatively you can use the zfs-import service instead of zfs-import-cache (this can slow down boot, since the drives have to be scanned without a cache): systemctl disable zfs-import-cache.service && systemctl enable zfs-import.service followed by a reboot should hopefully fix the issue.
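a quick way to check whether the pool actually ended up in the cachefile (pool name taken from this thread) should be:
Code:
zdb -U /etc/zfs/zpool.cache | grep -i datapool   # datapool should show up in the cached configuration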
 
Thanks for your reply.

Here is the output:
Code:
systemctl status zfs-import.service zfs-import-cache.service
● zfs-import.service
     Loaded: masked (Reason: Unit zfs-import.service is masked.)
     Active: inactive (dead)

● zfs-import-cache.service - Import ZFS pools by cache file
     Loaded: loaded (/lib/systemd/system/zfs-import-cache.service; enabled; vendor preset: enabled)
     Active: active (exited) since Mon 2022-01-03 13:31:29 CET; 1h 26min ago
       Docs: man:zpool(8)
   Main PID: 1641 (code=exited, status=0/SUCCESS)
      Tasks: 0 (limit: 76969)
     Memory: 0B
        CPU: 0
     CGroup: /system.slice/zfs-import-cache.service

Jan 03 13:31:28 pve systemd[1]: Starting Import ZFS pools by cache file...
Jan 03 13:31:29 pve systemd[1]: Finished Import ZFS pools by cache file.

Setting the cachefile and updating the initramfs didn't change anything.
The exact error on the bootscreen is:
[FAILED] Failed to start Import ZFS pool datapool.

Disabling zfs-import-cache and enabling zfs-import throws an error:
Code:
systemctl disable zfs-import-cache.service && systemctl enable zfs-import.service
Removed /etc/systemd/system/zfs-import.target.wants/zfs-import-cache.service.
Failed to enable unit: Unit file /lib/systemd/system/zfs-import.service is masked.
 
Failed to enable unit: Unit file /lib/systemd/system/zfs-import.service is masked.
ah sorry. you can try it like this instead: systemctl enable zfs-import@POOLNAME and it should be enabled.

if after reboot you still get the same error please post the output from journalctl -b0 | grep -i zfs -C 2
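spelled out for the pool in this thread, that would be roughly:
Code:
systemctl enable zfs-import@datapool.service
reboot
# if the error still shows up after the reboot:
journalctl -b0 | grep -i zfs -C 2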
 
I enabled zfs-import for my datapool and deactivated cache for the pool with zpool set cachefile=none datapool.
Now after a reboot no error occurs.

But I really do not understand what the problem is/was.
This solution just changes the behavior to scan for the pools instead of importing them from the cachefile (in my understanding).

I will keep trying. Maybe a cachefile per pool makes a difference?

Thanks for your help!

// Edit:
Just for all those who find this thread: I just re-disabled the import for my datapool with systemctl disable zfs-import@POOLNAME and changed the cachefile of this pool to a separate one, /etc/zfs/zpool2.cache, with zpool set cachefile=/etc/zfs/zpool2.cache <YOURPOOL>.
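In concrete commands, that was:
Code:
systemctl disable zfs-import@datapool
zpool set cachefile=/etc/zfs/zpool2.cache datapool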

No error after reboot.

I will have a closer look in the future if this error will come back again.
 
Now after a reboot no error occurs.
great!! also thanks for trying the alternative solution as well :)

please mark the thread [SOLVED] if the error doesn't occur anymore, so others can know what to expect ;)
 
I have the same problem with a second pool named BIG for the HDDs.
But I also have some errors in the pool status.
Sorry, I don't have the zpool status from before the change.
I did the same manipulation as you ("set a second cachefile for this pool") but still have errors on the pool for one disk (all the disks are new, ~200 hours):
Code:
NAME                                          STATE     READ WRITE CKSUM
PVE1_BIG                                      ONLINE       0     0     0
  raidz1-0                                    ONLINE       0     0     0
    ata-WDC_WD40EFZX-68AWUN0_WD-WX82DA112DD4  ONLINE       1     1     0
    ata-WDC_WD40EFZX-68AWUN0_WD-WX12DC128V96  ONLINE       0     0     0
    ata-WDC_WD40EFZX-68AWUN0_WD-WX52DB45EVT4  ONLINE       0     0     0
Maybe the first disk is defective, but does anybody know what the consequences are of using the same cachefile for 2 different pools?
 
Well, it seems the cachefile setting doesn't survive a reboot:
Code:
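# presumably the output of: zpool get cachefile PVE1_BIG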
PVE1_BIG  cachefile                      none                           local
The service is already inactive
Code:
systemctl status zfs-import@PVE1_BIG
zfs-import@PVE1_BIG.service - Import ZFS pool PVE1_BIG
     Loaded: loaded (/lib/systemd/system/zfs-import@.service; disabled; vendor preset: enabled)
     Active: inactive (dead)
       Docs: man:zpool(8)
 
Hello there,

I would like to submit my solution to the error message while booting: "[FAILED] Failed to start Import ZFS pool YOURRAID". I found out that there was a symlink from /etc/systemd/system/zfs-import.target.wants/zfs-import@YOURRAID.service to /lib/systemd/system/zfs-import@.service, and I think that this causes the notification on startup, at least on my system.
Step-by-step solution:
1. Navigate to the folder cd /etc/systemd/system/zfs-import.target.wants/
2. Search for your file with ls
3. Take your file and look up the symlink ls -l zfs-import@YOURRAID.service
4. Unlink the symlink unlink zfs-import@YOURRAID.service
5. reboot
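All of the above in one place:
Code:
cd /etc/systemd/system/zfs-import.target.wants/
ls -l zfs-import@YOURRAID.service   # check where the symlink points
unlink zfs-import@YOURRAID.service
reboot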

Since then I have no error messages during reboot.
Helpful website: https://markontech.com/linux/create-symlinks-in-linux/
 
it seems we got the same issue on proxmox-ve: 7.3-1 (running kernel: 5.15.85-1-pve)
could someone clarify whether the issue is simply that one cachefile is not suitable for more than one ZFS storage, or whether some other issue is at work?
 
Same error here. Fresh Proxmox install 7.3, but the error doesn't seem to cause any issues.
 
Hello,

I would like to submit my solution to the error message while booting: "[FAILED] Failed to start Import ZFS pool YOURRAID". I found out that there was a symlink from /etc/systemd/system/zfs-import.target.wants/zfs-import@YOURRAID.service to /lib/systemd/system/zfs-import@.service, and I think that this causes the notification on startup, at least on my system.
Step-by-step solution:
1. Navigate to the folder cd /etc/systemd/system/zfs-import.target.wants/
2. Search for your file with ls
3. Take your file and look up the symlink ls -l zfs-import@YOURRAID.service
4. Unlink the symlink unlink zfs-import@YOURRAID.service
5. reboot

Since then I have no error messages during reboot.
Helpful website: https://markontech.com/linux/create-symlinks-in-linux/
Thank you very much, I solved this problem thanks to your post. Everything works perfectly.
 
Thank you very much, I solved this problem thanks to your post. Everything works perfectly.
 

Attachment: zfs.jpg
it seems we got the same issue on proxmox-ve: 7.3-1 (running kernel: 5.15.85-1-pve)
could someone clarify whether the issue is simply that one cachefile is not suitable for more than one ZFS storage, or whether some other issue is at work?
I also think it has something to do with the cachefile being set to none. Maybe it is necessary to give the ZFS raid its own cachefile.
 
Just for all those who find this thread: I just re-disabled the import for my datapool with systemctl disable zfs-import@POOLNAME and changed the cachefile of this pool to a separate one, /etc/zfs/zpool2.cache, with zpool set cachefile=/etc/zfs/zpool2.cache <YOURPOOL>.

No error after reboot.

I will have a closer look in the future if this error will come back again.
Same here on a fresh 7.4 install. The commands solved the issue.
 
Hi all,

just had the same problem and went down the same path that @HomemadeAdvanced did.

I was moving a ZFS-Mirror to another installation consisting of one zpool.

I didn't delete or unlink the file as described, but had a look inside the linked service file. It contained something like
Code:
ExecStart=/sbin/zpool import -aN -d /dev/disk/by-id -o cachefile=none $ZPOOL_IMPORT_OPTS

When I executed
Code:
/sbin/zpool import -aN
I got the error that it wasn't able to import the zpool.

A simple
Code:
/sbin/zpool import -faN
(note the additional -f) imported the zpool. That solved my problem.
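If the pool then needs to import cleanly on every boot again, re-applying the cachefile as suggested earlier in this thread should look roughly like this (the pool name is a placeholder):
Code:
/sbin/zpool import -faN -d /dev/disk/by-id
zpool set cachefile=/etc/zfs/zpool.cache <YOURPOOL>
update-initramfs -k all -u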
The aforementioned solution may have triggered the same behaviour.
I hope it is helpful for someone.

All the best and thanks to all.
 
With me it appeared to be a time-out issue with my brand new NVMe SSD and its brand new 'Maxiotek 1602' controller, in combination with an installation of Proxmox 8.0 and the Linux 6.1 kernel. Sometimes it booted, sometimes not, no matter what delay I put where. Then I found the disk was not available at all in /dev during failures.

https://www.linux.org/threads/lexar-nm790-nvme-fails-to-initialize.46315/
https://lore.kernel.org/lkml/7cd693dd-a6d7-4aab-aef0-76a8366ceee6@archlinux.org/T/
https://www.reddit.com/r/archlinux/comments/15xbxeo/nvme_device_not_ready_aborting_initialisation/

The issue was introduced in a 6.1.x kernel (I read it somewhere but could not find the exact version while writing this message), and hopefully it is fixed in the 6.5 version.
For now I reverted back to Proxmox 7.4 with kernel 5.15 and will wait for a version with a fixed kernel.


IMG_20230922_161517306-small.jpeg
 
Hi,
With me it appeared to be a time-out issue with my brand new NVMe SSD and its brand new 'Maxiotek 1602' controller, in combination with an installation of Proxmox 8.0 and the Linux 6.1 kernel. Sometimes it booted, sometimes not, no matter what delay I put where. Then I found the disk was not available at all in /dev during failures.

https://www.linux.org/threads/lexar-nm790-nvme-fails-to-initialize.46315/
https://lore.kernel.org/lkml/7cd693dd-a6d7-4aab-aef0-76a8366ceee6@archlinux.org/T/
https://www.reddit.com/r/archlinux/comments/15xbxeo/nvme_device_not_ready_aborting_initialisation/

The issue was introduced in a 6.1.x kernel (I read it somewhere but could not find the exact version while writing this message), and hopefully it is fixed in the 6.5 version.
For now I reverted back to Proxmox 7.4 with kernel 5.15 and will wait for a version with a fixed kernel.


might be the same issue as mentioned here: https://forum.proxmox.com/threads/128738/post-588785
Do you also see the "Device not ready; aborting initialisation, CSTS=0x0" message?
 
