ZFS cachefile setting seems to change after reboot, and import services return errors

Mar 3, 2024
Hi everyone,

I have 3 ZFS pools:
- pond (mirror pool)
- tyler (single disk pool)
- backup (another single disk pool)

They all show up fine during normal usage, but I noticed errors when booting.
For example, on the last boot both the pond and backup import services failed.

Code:
# systemctl status zfs-import@pond.service
× zfs-import@pond.service - Import ZFS pool pond
     Loaded: loaded (/lib/systemd/system/zfs-import@.service; enabled; preset: enabled)
     Active: failed (Result: exit-code) since Sun 2024-03-03 17:16:21 CET; 13min ago
       Docs: man:zpool(8)
    Process: 612 ExecStart=/sbin/zpool import -N -d /dev/disk/by-id -o cachefile=none pond (code=exited, status=1/FAILURE)
   Main PID: 612 (code=exited, status=1/FAILURE)
        CPU: 13ms

Mar 03 17:16:20 gen10-proxmox systemd[1]: Starting zfs-import@pond.service - Import ZFS pool pond...
Mar 03 17:16:21 gen10-proxmox zpool[612]: cannot import 'pond': no such pool available
Mar 03 17:16:21 gen10-proxmox systemd[1]: zfs-import@pond.service: Main process exited, code=exited, status=1/FAILURE
Mar 03 17:16:21 gen10-proxmox systemd[1]: zfs-import@pond.service: Failed with result 'exit-code'.
Mar 03 17:16:21 gen10-proxmox systemd[1]: Failed to start zfs-import@pond.service - Import ZFS pool pond.

I guess they were imported by the zfs-import-cache service instead, because I followed the instructions here.

Code:
~# systemctl status zfs-import-cache.service
● zfs-import-cache.service - Import ZFS pools by cache file
     Loaded: loaded (/lib/systemd/system/zfs-import-cache.service; enabled; preset: enabled)
     Active: active (exited) since Sun 2024-03-03 17:16:35 CET; 14min ago
       Docs: man:zpool(8)
    Process: 610 ExecStart=/sbin/zpool import -c /etc/zfs/zpool.cache -aN $ZPOOL_IMPORT_OPTS (code=exited, status=0/SUCCESS)
   Main PID: 610 (code=exited, status=0/SUCCESS)
        CPU: 46ms

Mar 03 17:16:20 gen10-proxmox systemd[1]: Starting zfs-import-cache.service - Import ZFS pools by cache file...
Mar 03 17:16:35 gen10-proxmox zpool[610]: cannot import 'tyler': pool already exists
Mar 03 17:16:35 gen10-proxmox zpool[610]: no pools available to import
Mar 03 17:16:35 gen10-proxmox zpool[610]: cachefile import failed, retrying
Mar 03 17:16:35 gen10-proxmox systemd[1]: Finished zfs-import-cache.service - Import ZFS pools by cache file.

The output of zpool get cachefile is:

Code:
# zpool get cachefile
NAME    PROPERTY   VALUE      SOURCE
backup  cachefile  -          default
pond    cachefile  -          default
tyler   cachefile  none       local
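
To make that table easier to read at a glance, here is a minimal sketch of a filter for it. The function name classify_cachefile is made up for illustration. It assumes the four-column layout shown above, that "-" means the pool uses the default cache file location, and that "none" means the pool is not recorded in any cache file.

```shell
#!/bin/sh
# Hypothetical helper (not part of ZFS): reads `zpool get cachefile`
# output on stdin and prints one summary line per pool.
classify_cachefile() {
    awk 'NR > 1 && $2 == "cachefile" {
        if ($3 == "none")
            print $1 ": not recorded in any cache file"
        else if ($3 == "-")
            print $1 ": default cache file location"
        else
            print $1 ": custom cache file " $3
    }'
}

# Usage on a live system (requires ZFS to be installed):
#   zpool get cachefile | classify_cachefile
```

With the output above, this would flag tyler as the only pool not recorded in a cache file. That would fit the zfs-import@ units being the ones that import with -o cachefile=none, though that reading is a guess based on the ExecStart line rather than something confirmed.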

So what's happening here? Does the zfs-import-cache service import the pools before zfs-import@POOL_NAME.service has a chance to do it? Which services fail seems to vary randomly from boot to boot (for a couple of boots, only the "backup" pool was erroring out).
What's the correct way of doing this? Should I regenerate the cachefile and then disable all the zfs-import@POOL_NAME.service services?
How can I check that the cachefile is actually set up correctly?

Thank you