Failed to start Import ZFS pool

Hi,

at startup of two freshly installed Proxmox nodes (v7.2.1) I get an error that a ZFS pool could not be imported.

2022-05-11-11_17_00-Remote-KVM-2.png

PVE_ZFS_POOL is the name of the pool. Proxmox itself is installed on a hardware RAID, not on ZFS; this is a separate ZFS storage pool.

Proxmox starts up after that message and seems to work. Is this error message something I need to care about? What is the reason for this error?

Thank you so much for your help!
 
hi,

at startup of two freshly installed Proxmox nodes (v7.2.1) I get an error that a ZFS pool could not be imported.
could be that your ZFS import cache is corrupted or has old values in it.

Code:
zpool set cachefile=/etc/zfs/zpool.cache YOUR_POOLNAME   # rewrite the pool's entry in the cache file
update-initramfs -k all -u                               # rebuild the initramfs so it picks up the new cache
reboot

hope this helps
 
Thank you very much for the fast reply.

This did not solve it. I get this error message on two freshly installed nodes, so it seems to be a more fundamental problem in the boot process. After startup the pool is available and online.

Is it safe to ignore this message, or any other idea what could be the reason for it?
 
This did not solve it.
have you replaced YOUR_POOLNAME with PVE_ZFS_POOL?

Is it safe to ignore this message
if you're trying to start VMs on the pool automatically at boot, it could cause issues (the volumes wouldn't be immediately available for example).
otherwise if the pool is available after startup, it should be safe to ignore.
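
for example, after a reboot you could quickly check with something like (pool name taken from your post):

Bash:
zpool status PVE_ZFS_POOL         # pool imported and healthy?
zfs list -r PVE_ZFS_POOL          # datasets/zvols visible to the storage layer?
systemctl --failed | grep -i zfs  # any ZFS units that failed during boot?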

or any other idea what could be the reason for it?
you can check your ZFS pool import services: systemctl | grep zfs-import

you should see two services: zfs-import and zfs-import-cache. normally zfs-import-cache is the one that's active (which was the reason for my guess about the cachefile).

you can unmask the zfs-import service and use that one instead of the cachefile as a temporary fix, but that one goes through the devices manually without any caching, so it might take longer to boot as well.
 
have you replaced YOUR_POOLNAME with PVE_ZFS_POOL?
of course ;)

if you're trying to start VMs on the pool automatically at boot, it could cause issues (the volumes wouldn't be immediately available for example).
This is a very important requirement. The two nodes should run as a cluster with a third quorum device, and two identical ZFS pools on the nodes will be used for replication and automatic failover. This is working well.

I haven't checked the behavior of automatically starting VMs at boot yet. So far there are no VMs on the nodes; I will check this.

Here is some more information:

Bash:
systemctl | grep zfs-import
  zfs-import-cache.service          loaded active exited  Import ZFS pools by cache file
● zfs-import@PVE_ZFS_POOL.service   loaded failed failed  Import ZFS pool PVE_ZFS_POOL
  zfs-import.target                 loaded active active  ZFS pool import target

Bash:
systemctl status zfs-import
● zfs-import.service
     Loaded: masked (Reason: Unit zfs-import.service is masked.)
     Active: inactive (dead)


Unmasking zfs-import does not change anything either:
Bash:
systemctl unmask zfs-import

This command didn't give me any confirmation, so I removed the mask symlink manually:

Bash:
ls -l zfs-import*
-rw-r--r-- 1 root root 519 Mar 24 09:28 zfs-import-cache.service
-rw-r--r-- 1 root root 505 Mar 24 09:28 zfs-import-scan.service
lrwxrwxrwx 1 root root   9 Mar 24 09:28 zfs-import.service -> /dev/null
-rw-r--r-- 1 root root 349 Mar 24 09:28 zfs-import@.service
-rw-r--r-- 1 root root 101 Mar 24 09:28 zfs-import.target

Bash:
rm zfs-import.service
systemctl daemon-reload
systemctl status zfs-import

After reboot

Bash:
systemctl status zfs-import
Unit zfs-import.service could not be found.

Anyhow, the error message remains the same.

I will check whether this affects the automatic boot process. However, this error leaves me with a bad feeling. Any other idea what I could do?
 
my bad, it should be zfs-import-scan.service that you can enable with systemctl (that one goes through the devices manually without any caching, so it might take longer to boot as well).
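
roughly like this (just a sketch, not tested on your exact setup):

Bash:
systemctl disable zfs-import-cache.service
systemctl enable zfs-import-scan.service
# note: the scan unit only starts if /etc/zfs/zpool.cache is absent or empty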

you can also check journalctl -xe zfs-import@PVE_ZFS_POOL.service for possible hints as to why it failed.
 
Bash:
journalctl -xe zfs-import@PVE_ZFS_POOL.service
Failed to add match 'zfs-import@PVE_ZFS_POOL.service': Invalid argument

Bash:
systemctl status zfs-import-scan
● zfs-import-scan.service - Import ZFS pools by device scanning
     Loaded: loaded (/lib/systemd/system/zfs-import-scan.service; enabled; vendor preset: disabled)
     Active: inactive (dead)
  Condition: start condition failed at Wed 2022-05-11 13:29:32 CEST; 25min ago
             └─ ConditionFileNotEmpty=!/etc/zfs/zpool.cache was not met
       Docs: man:zpool(8)

I commented out the cache-file condition in zfs-import-scan.service:

Bash:
vi zfs-import-scan.service

#ConditionFileNotEmpty=!/etc/zfs/zpool.cache
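
(Editing the unit under /lib/systemd/system directly is probably not the cleanest way; a drop-in override via systemctl edit should achieve the same and survive package updates, something like:)

Bash:
systemctl edit zfs-import-scan.service
# contents of the override:
# [Unit]
# ConditionFileNotEmpty=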

After reboot the status of zfs-import-scan is active

Bash:
systemctl status zfs-import-scan
● zfs-import-scan.service - Import ZFS pools by device scanning
     Loaded: loaded (/lib/systemd/system/zfs-import-scan.service; enabled; vendor preset: disabled)
     Active: active (exited) since Wed 2022-05-11 14:04:33 CEST; 56s ago
       Docs: man:zpool(8)
    Process: 858 ExecStart=/sbin/zpool import -aN -d /dev/disk/by-id -o cachefile=none $ZPOOL_IMPORT_OPTS (code=exited, status=0/SUCCESS)
   Main PID: 858 (code=exited, status=0/SUCCESS)
        CPU: 16ms

But the problem is still the same:

Bash:
journalctl -xe zfs-import@PVE_ZFS_POOL.service
Failed to add match 'zfs-import@PVE_ZFS_POOL.service': Invalid argument

and it still shows the error message "Failed to start Import ZFS pool".
 
After reboot the status of zfs-import-scan is active ... But the problem is still the same
did you also disable zfs-import-cache?

Invalid argument
sorry, should be just journalctl -r | grep zfs-import or journalctl -u service_name
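
for the instance unit in your case that would be, for example:

Bash:
journalctl -b0 -u zfs-import@PVE_ZFS_POOL.service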
 
yes, zfs-import-cache is disabled

Bash:
journalctl -r | grep zfs-import
May 11 14:25:08 PVE-01 systemd[1]: zfs-import@PVE_ZFS_POOL.service: Failed with result 'exit-code'.
May 11 14:25:08 PVE-01 systemd[1]: zfs-import@PVE_ZFS_POOL.service: Main process exited, code=exited, status=1/FAILURE
May 11 14:25:05 PVE-01 udevadm[642]: systemd-udev-settle.service is deprecated. Please fix zfs-import-scan.service not to pull it in.

Bash:
journalctl -u zfs-import-cache
May 11 14:22:29 PVE-01 systemd[1]: zfs-import-cache.service: Succeeded.
May 11 14:22:29 PVE-01 systemd[1]: Stopped Import ZFS pools by cache file.

Bash:
systemctl status zfs-import-cache
● zfs-import-cache.service - Import ZFS pools by cache file
     Loaded: loaded (/lib/systemd/system/zfs-import-cache.service; disabled; vendor preset: enabled)
     Active: inactive (dead)
       Docs: man:zpool(8)

Bash:
systemctl status zfs-import
● zfs-import.service
     Loaded: masked (Reason: Unit zfs-import.service is masked.)
     Active: inactive (dead)
 
could you post the full journal here?

run journalctl -b0 > journal.txt after a boot with the error, and post/attach the resulting journal.txt file.

removing the zfs-import service was not necessary, sorry if I caused any confusion.
 
Here is the journal. Hopefully you find something. Thank you very much for looking into it.

zfs-import is a masked service. The pool has its own zfs-import service file.

Bash:
systemctl status zfs-import@PVE_ZFS_POOL.service
● zfs-import@PVE_ZFS_POOL.service - Import ZFS pool PVE_ZFS_POOL
     Loaded: loaded (/lib/systemd/system/zfs-import@.service; enabled; vendor preset: enabled)
     Active: failed (Result: exit-code) since Thu 2022-05-12 01:56:30 CEST; 15s ago
       Docs: man:zpool(8)
    Process: 3834 ExecStart=/sbin/zpool import -N -d /dev/disk/by-id -o cachefile=none PVE_ZFS_POOL (code=exited, status=1/FAILURE)
   Main PID: 3834 (code=exited, status=1/FAILURE)
        CPU: 17ms

May 12 01:56:27 PVE-02 systemd[1]: Starting Import ZFS pool PVE_ZFS_POOL...
May 12 01:56:30 PVE-02 zpool[3834]: cannot import 'PVE_ZFS_POOL': a pool with that name already exists
May 12 01:56:30 PVE-02 zpool[3834]: use the form 'zpool import <pool | id> <newpool>' to give it a new name
May 12 01:56:30 PVE-02 systemd[1]: zfs-import@PVE_ZFS_POOL.service: Main process exited, code=exited, status=1/FAILURE
May 12 01:56:30 PVE-02 systemd[1]: zfs-import@PVE_ZFS_POOL.service: Failed with result 'exit-code'.
May 12 01:56:30 PVE-02 systemd[1]: Failed to start Import ZFS pool PVE_ZFS_POOL.
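
So apparently the pool gets imported twice during boot, once by the cache/scan unit and once by the per-pool instance unit. Something like this should show what is involved (just for inspection, not a confirmed fix):

Bash:
systemctl list-units --all 'zfs-import*'
systemctl cat zfs-import@PVE_ZFS_POOL.service
# if the pool is already imported by the cache/scan unit, the per-pool
# instance could in theory be disabled (at your own risk):
# systemctl disable zfs-import@PVE_ZFS_POOL.service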
 

Attachments

  • journal.txt
    140.6 KB
I started from scratch today and reinstalled the nodes. The first boot after the installation also shows a ZFS error. There are no ZFS volumes defined so far.

2022-05-12 11_09_51-Remote KVM.png


No-ZFS.png

It seems to be a more general problem?
 
I did another reinstall in the meantime. In the former install no ZFS pool was defined, but there were leftover ZFS reserved partitions on two SSDs.

No-ZFS2.png

I removed these partitions with fdisk and did another reinstall. Now the error message is gone.
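
(For reference, this is roughly how such leftover ZFS signatures can be found before wiping; sdX is just a placeholder for the affected disk:)

Bash:
wipefs -n /dev/sdX                       # dry run, lists leftover signatures on the disk
lsblk -o NAME,FSTYPE,PARTTYPE /dev/sdX   # shows filesystem signatures and partition type GUIDs
zpool labelclear -f /dev/sdX1            # clears a stale ZFS label from a partition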

Even just the leftover ZFS partitions are enough to cause this error. When I create a ZFS pool now, the error is back.
 
ZFS pool created, and the error message is back:

Bash:
journalctl -b0 |grep ZFS_POO
May 12 15:50:28 PVE-02 systemd[1]: Starting Import ZFS pool PVE_ZFS_POOL...
May 12 15:50:28 PVE-02 zpool[867]: cannot import 'PVE_ZFS_POOL': no such pool available
May 12 15:50:28 PVE-02 systemd[1]: zfs-import@PVE_ZFS_POOL.service: Main process exited, code=exited, status=1/FAILURE
May 12 15:50:28 PVE-02 systemd[1]: zfs-import@PVE_ZFS_POOL.service: Failed with result 'exit-code'.
May 12 15:50:28 PVE-02 systemd[1]: Failed to start Import ZFS pool PVE_ZFS_POOL.
May 12 15:50:29 PVE-02 zed[1318]: eid=5 class=config_sync pool='PVE_ZFS_POOL'
May 12 15:50:29 PVE-02 zed[1315]: eid=2 class=config_sync pool='PVE_ZFS_POOL'
May 12 15:50:29 PVE-02 zed[1316]: eid=3 class=pool_import pool='PVE_ZFS_POOL'

Any other idea how to fix that?
Could you find something in the journal file?
 
I have a similar error that I just can't get rid of.
1) my pool RZ2-1_1-4_4TB reports online and it is working well
2) the pool wd_7disks_rz2 has not existed for a long time.
In both cases I get the [FAILED] Failed to start Import ZFS pool poolname message at each boot.
How can I correct this?

Can I just delete both:
/etc/systemd/system/zfs-import.target.wants/zfs-import@RZ2\x2d1_1\x2d4_4TB.service
/etc/systemd/system/zfs-import.target.wants/zfs-import@wd_7disks_rz2.service

Or should I keep /etc/systemd/system/zfs-import.target.wants/zfs-import@RZ2\x2d1_1\x2d4_4TB.service, because some VMs boot from this pool at host startup and that is why I get the error?

Bash:
zpool list -v
NAME                                             SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ  FRAG    CAP  DEDUP  HEALTH  ALTROOT
R1_1.6TB_SSD_EVO860                             1.62T   213G  1.42T        -         -   29%    12%  1.00x  ONLINE  -
  mirror-0                                      1.62T   213G  1.42T        -         -   29%  12.8%      -  ONLINE
    ata-Samsung_SSD_860_EVO_2TB_S597NJ0NB19827A     -      -      -        -         -     -      -      -  ONLINE
    ata-Samsung_SSD_860_EVO_2TB_S597NJ0NB19834W     -      -      -        -         -     -      -      -  ONLINE
RZ2-1_1-4_4TB                                   14.5T  1.14T  13.4T        -         -    1%     7%  1.00x  ONLINE  -
  raidz2-0                                      14.5T  1.14T  13.4T        -         -    1%  7.83%      -  ONLINE
    ata-WDC_WD40EFAX-68JH4N1_WD-WXB2D718V32S        -      -      -        -         -     -      -      -  ONLINE
    ata-WDC_WD40EFAX-68JH4N1_WD-WXB2D718V5PK        -      -      -        -         -     -      -      -  ONLINE
    ata-WDC_WD40EFAX-68JH4N1_WD-WXB2D71JYVUJ        -      -      -        -         -     -      -      -  ONLINE
    ata-WDC_WD40EFAX-68JH4N1_WD-WXB2D718VANJ        -      -      -        -         -     -      -      -  ONLINE
RZ2-2_5-8_2TB                                   7.27T  6.63T   654G        -         -   54%    91%  1.00x  ONLINE  -
  raidz2-0                                      7.27T  6.63T   654G        -         -   54%  91.2%      -  ONLINE
    ata-WDC_WD2000FYYZ-01UL1B0_WD-WCC1P0117476      -      -      -        -         -     -      -      -  ONLINE
    ata-WDC_WD2000FYYZ-01UL1B0_WD-WCC1P0127826      -      -      -        -         -     -      -      -  ONLINE
    ata-WDC_WD2000FYYZ-01UL1B0_WD-WCC1P0080071      -      -      -        -         -     -      -      -  ONLINE
    ata-WDC_WD2000FYYZ-01UL1B0_WD-WCC1P0117388      -      -      -        -         -     -      -      -  ONLINE

Bash:
cat /var/log/syslog | grep zfs-import
May 12 10:35:40 pve systemd[1]: zfs-import@RZ2\x2d2_5\x2d8_2TB.service: Succeeded.
May 12 10:38:52 pve udevadm[610]: systemd-udev-settle.service is deprecated. Please fix zfs-import-scan.service, zfs-import-cache.service not to pull it in.
May 12 10:38:52 pve systemd[1]: zfs-import@RZ2\x2d1_1\x2d4_4TB.service: Main process exited, code=exited, status=1/FAILURE
May 12 10:38:52 pve systemd[1]: zfs-import@RZ2\x2d1_1\x2d4_4TB.service: Failed with result 'exit-code'.
May 12 10:38:52 pve systemd[1]: zfs-import@wd_7disks_rz2.service: Main process exited, code=exited, status=1/FAILURE
May 12 10:38:52 pve systemd[1]: zfs-import@wd_7disks_rz2.service: Failed with result 'exit-code'.
May 12 11:46:25 pve systemd[1]: zfs-import@RZ2\x2d2_5\x2d8_2TB.service: Succeeded.
May 12 11:46:25 pve systemd[1]: zfs-import@RZ2\x2d2_5\x2d8_2TB.service: Consumed 1.403s CPU time.
May 12 11:49:32 pve udevadm[610]: systemd-udev-settle.service is deprecated. Please fix zfs-import-scan.service, zfs-import-cache.service not to pull it in.
May 12 11:49:32 pve systemd[1]: zfs-import@RZ2\x2d1_1\x2d4_4TB.service: Main process exited, code=exited, status=1/FAILURE
May 12 11:49:32 pve systemd[1]: zfs-import@RZ2\x2d1_1\x2d4_4TB.service: Failed with result 'exit-code'.
May 12 11:49:32 pve systemd[1]: zfs-import@wd_7disks_rz2.service: Main process exited, code=exited, status=1/FAILURE
May 12 11:49:32 pve systemd[1]: zfs-import@wd_7disks_rz2.service: Failed with result 'exit-code'.
 
There is still no solution for this problem. However, I decided to live with this error message. The cluster is running fine without any errors. It seems the error message does not have any effect on the system.
 
I too have the exact same "Failed to start Import ZFS pool backup1" message on one of my pools. I was wondering for the longest time why that particular pool no longer showed up as an SMB share like my other pools that are set up to do so, until recently I decided to observe the boot process, saw the message, and figured it had something to do with that. The pool functions properly; the only thing is I have to issue "zfs set sharesmb=on backup1" after any reboot of the host to get it to show up as an SMB share again. Also, this is a single-disk pool, and about 6 months ago I copied all the data to a larger disk and tried to delete and/or rename the original "backup1" pool to something else so I could name the larger disk backup1, but it didn't go as planned. All pools function properly; just the one named backup1 gives that error on boot, even if I delete the pool entirely and re-create it with the same name.
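
In case it helps with debugging, the property and the ZFS share service can be checked after a reboot with something like:

Bash:
zfs get sharesmb backup1
systemctl status zfs-share.service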
 
I used VMware Workstation 16 Player to create a PVE installation using ZFS and see the same problem, but it does not affect functionality.
 
