Problem importing zfs pool

Chamonix73

Hi

I am running PBS as a VM and I have 2 HDDs in a mirror configuration.
At boot the host starts up as normal, but the PBS VM hangs at "importing pools" while loading, so I cannot SSH to it or reach it via the GUI.

On the host the pool is not listed, but the disks are OK and I can reach the host through the GUI.
"zpool status" gives me nothing and "zpool import -f poolname" freezes or hangs, but with "zpool import" I get the following output when trying to import manually:

Code:
root@pve5:~# zpool import
   pool: Backups
     id: 2818162679070632605
  state: ONLINE
 status: Some supported features are not enabled on the pool.
         (Note that they may be intentionally disabled if the
         'compatibility' property is set.)
 action: The pool can be imported using its name or numeric identifier, though
         some features will not be available without an explicit 'zpool upgrade'.
 config:

        Backups     ONLINE
          mirror-0  ONLINE
            sdb     ONLINE
            sda     ONLINE
root@pve5:~# zpool import -f -d /dev/disk/by-id/2818162679070632605
no pools available to import
root@pve5:~# zpool import -f -d /dev/disk/by-id/Backups
no pools available to import
 
Please describe your setup in more detail. What disk(s) and datastore does PBS use? How are they connected to the PBS?
 
Please describe your setup in more detail. What disk(s) and datastore does PBS use? How are they connected to the PBS?
Hi, and thanks for the reply.

My backup machine is a weak 2-core ITX board running Proxmox with PBS virtualized on it. It has not caused me any trouble in almost 3 years. It wakes up at certain times to perform backup tasks and stays suspended in between.

The array is a mirror of 2 dedicated 1 TB disks (sda and sdb), as shown in the first message, passed through and dedicated to PBS storage with nothing else on them.
I have never had a problem like this before, and as far as I understand the PBS VM does not start because the PVE host cannot mount the ZFS pool.
It seems like something is causing the pool to take ages to mount, and/or it does not mount at all for some reason.
 
Was the mirror zpool created in the PVE (hypervisor) or in the PBS (VM)?
If in the VM, then the zpool should NOT be mounted in the PVE. That's the way pass-through works: the hypervisor doesn't use / mount / touch / mess with ;-) the device.

Note that I don't know if the opposite (creating the zpool in the hypervisor and then passing it through to the VM) is possible... That's why I'm asking where it was created.
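By the way, a quick way to check from the PVE side how the disks are attached to the VM is to look at the VM config; just an example, assuming the PBS VM were VMID 103 (adjust to yours):

Code:
# on the PVE host: show the VM's disk lines to see whether the whole
# /dev/disk/by-id/... devices are passed through to the PBS VM
qm config 103 | grep -E '^(scsi|sata|virtio|ide)'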
 
On the import commands:
-d takes a directory to scan, and the pool name or ID goes as a separate argument. So /dev/disk/by-id/2818162679070632605 needs to be /dev/disk/by-id/ 2818162679070632605
and/or /dev/disk/by-id/Backups should be /dev/disk/by-id/ Backups
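For illustration only (and only if you really wanted the pool imported on the PVE, which is normally not the case with passed-through disks), the corrected calls would look roughly like this:

Code:
# -d expects a directory to scan; the pool name or numeric id is a separate argument
zpool import -d /dev/disk/by-id/                            # list importable pools found there
zpool import -f -d /dev/disk/by-id/ 2818162679070632605     # import by numeric id
zpool import -f -d /dev/disk/by-id/ Backups                 # or import by name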
 
Thanks so much for the replies.

Onslow: this got me wondering, since it's been working flawlessly since the install.

Stupid me may have tried importing the ZFS pool from the PVE; maybe that is what's causing the PBS VM problem now, i.e. the pool cannot be mounted from within the VM and the PBS VM does not boot.
I remember a message saying "was previously mounted from another system" or something when I was trying last week.
Is there any way I can reverse this? Any way I can force the PBS VM to boot? I cannot SSH to it.
BUT, I have always been able to see the ZFS pool from the PVE. Would that have been the case if it was created inside the VM only?

On the PVE, journalctl is reporting "PBS: error fetching datastores - 500 Can't connect to 192.168.10.116:8007 (No route to host)" and that's the name and IP of the PBS VM (it cannot boot).
Task history shows nothing else than start/stop etc for the VM.

The PBS VM is now hanging at this error message at boot, and I don't know how to get past it since the VM is not up:
 

Attachments: ksnip_20251014-141735.jpg
At the PVE command line:

Code:
zpool list
zpool status
zfs list

and post the results in the CODE tags (not screenshots) :)
 
At the PVE command line:

Code:
zpool list
zpool status
zfs list

and post the results in the CODE tags (not screenshots) :)
Thanks.
I just posted a screenshot since it was from the PBS VM, which cannot boot, so I couldn't post anything as text.
I tried all those commands on the PVE before my post. They give nothing.

Code:
root@pve5:~# zpool list
no pools available
root@pve5:~# zpool status
no pools available
root@pve5:~# zfs list
no datasets available
 
Thanks for the support and the reply!

Code:
root@pve5:~# lsblk
NAME                         MAJ:MIN RM   SIZE RO TYPE MOUNTPOINTS
sda                            8:0    0 931.5G  0 disk
├─sda1                         8:1    0 931.5G  0 part
└─sda9                         8:9    0     8M  0 part
sdb                            8:16   0 931.5G  0 disk
├─sdb1                         8:17   0 931.5G  0 part
└─sdb9                         8:25   0     8M  0 part
sdc                            8:32   0 111.8G  0 disk
├─sdc1                         8:33   0  1007K  0 part
├─sdc2                         8:34   0     1G  0 part
└─sdc3                         8:35   0 110.8G  0 part
  ├─pve-swap                 252:0    0     4G  0 lvm  [SWAP]
  ├─pve-root                 252:1    0  38.7G  0 lvm  /
  ├─pve-data_tmeta           252:2    0     1G  0 lvm
  │ └─pve-data-tpool         252:4    0  52.3G  0 lvm
  │   ├─pve-data             252:5    0  52.3G  1 lvm
  │   └─pve-vm--103--disk--0 252:6    0    32G  0 lvm
  └─pve-data_tdata           252:3    0  52.3G  0 lvm
    └─pve-data-tpool         252:4    0  52.3G  0 lvm
      ├─pve-data             252:5    0  52.3G  1 lvm
      └─pve-vm--103--disk--0 252:6    0    32G  0 lvm
zram0                        251:0    0     2G  0 disk [SWAP]

and:

Code:
root@pve5:~# df -hT
Filesystem           Type      Size  Used Avail Use% Mounted on
udev                 devtmpfs  3.8G     0  3.8G   0% /dev
tmpfs                tmpfs     775M  2.3M  772M   1% /run
/dev/mapper/pve-root ext4       38G   13G   24G  34% /
tmpfs                tmpfs     3.8G   34M  3.8G   1% /dev/shm
tmpfs                tmpfs     5.0M     0  5.0M   0% /run/lock
/dev/fuse            fuse      128M   32K  128M   1% /etc/pve
tmpfs                tmpfs     775M     0  775M   0% /run/user/0
 
You're welcome! I haven't really helped much so far ;-(.

This partition layout (sda1, sda9; sdb1, sdb9) pretty much confirms that these disks contain ZFS. The layout is quite characteristic; I have the same, and I also pass a ZFS disk through to a VM.
If you additionally issued blkid, you would probably get TYPE="zfs_member" on sda1 and sdb1.
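Roughly like this, just a read-only check:

Code:
# on the PVE: print the filesystem signatures of the data partitions
blkid /dev/sda1 /dev/sdb1      # expect TYPE="zfs_member" on both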

FWIW: I have checked my installation notes and I have found some remarks (which probably will not help in your situation, but just in case...).

Back then I noticed that after passing a disk through to a VM, installing PBS there and creating a ZFS datastore, my PVE showed "State: degraded" in systemctl status. It was because zfs-import-scan.service tried to import the ZFS pool from that disk (but of course failed with "pool was previously in use from another system").
That did NOT disturb the PBS VM though, so it's probably not the reason for your issue.

Just for cosmetics, I did in the PVE: systemctl disable --now zfs-import-scan

But, as I already said, it's probably not the reason in your system, especially since your setup has worked OK for years.

Back to the error shown in your (a little fuzzy) screenshot (failed to start zfs-import...).

I've searched Google and have found, among others, these threads: https://forum.proxmox.com/threads/proxmox-failed-to-start-import-zfs-pool-pool-name.117379/ and https://forum.proxmox.com/threads/second-zfs-pool-failed-to-import-on-boot.102409/

Have a look at them, especially the messages from Proxmox Staff Members and ...Retired Staff.

Maybe your issue comes from the zpool cache file?

But you wrote that the VM hangs during start, so I understand that you can't easily make changes in it... Maybe it could be booted somehow, e.g. in rescue mode?...
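If you do get a shell inside the VM (rescue mode or a live CD), this is roughly what I would look at for the cachefile idea; just a sketch, assuming the default /etc/zfs/zpool.cache path:

Code:
# inside the PBS VM, not on the PVE:
zpool status                                       # is anything imported at all?
ls -l /etc/zfs/zpool.cache                         # is there a (possibly stale) cachefile?
mv /etc/zfs/zpool.cache /etc/zfs/zpool.cache.bak   # set it aside; ZFS recreates it on the next import
zpool import -d /dev/disk/by-id/                   # see what the VM finds on the passed-through disks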
 
You're welcome! I haven't really helped much so far ;-(.

This partition layout (sda1, sda9; sdb1, sdb9) pretty much confirms that these disks contain ZFS. The layout is quite characteristic; I have the same, and I also pass a ZFS disk through to a VM.
If you additionally issued blkid, you would probably get TYPE="zfs_member" on sda1 and sdb1.
Late answer. I had some really bad weather passing through, so I turned off my servers to protect them from a power loss.
That is what I am getting, yes.
FWIW: I have checked my installation notes and I have found some remarks (which probably will not help in your situation, but just in case...).

Back then I noticed that after passing a disk through to a VM, installing PBS there and creating a ZFS datastore, my PVE showed "State: degraded" in systemctl status. It was because zfs-import-scan.service tried to import the ZFS pool from that disk (but of course failed with "pool was previously in use from another system").
That did NOT disturb the PBS VM though, so it's probably not the reason for your issue.
I am getting the same. The issue started last week, but the date indicates the last restart.
Code:
● pve5
    State: degraded
    Units: 426 loaded (incl. loaded aliases)
     Jobs: 0 queued
   Failed: 4 units
    Since: Tue 2025-10-14 11:31:15 CEST; 1 day 2h ago
Just for cosmetics, I did in the PVE: systemctl disable --now zfs-import-scan

But, as I already said, it's probably not the reason in your system, especially since your setup has worked OK for years.

Back to the error shown in your (a little fuzzy) screenshot (failed to start zfs-import...).

I've searched Google and have found, among others, these threads: https://forum.proxmox.com/threads/proxmox-failed-to-start-import-zfs-pool-pool-name.117379/ and https://forum.proxmox.com/threads/second-zfs-pool-failed-to-import-on-boot.102409/

Have a look at them, especially the messages from Proxmox Staff Members and ...Retired Staff.

Maybe your issue comes from the zpool cache file?

But you wrote that the VM hangs during start, so I understand that you can't easily make changes in it... Maybe it could be booted somehow, e.g. in rescue mode?...
Yes. I have tried to boot the PBS VM in recovery mode, but it won't accept the root password, which is weird to say the least.
I have also read those articles before, but since I cannot get SSH or a proper CLI on that VM, I don't know how to issue the commands.

I also found this error in the PVE logs

Code:
× zfs-import-cache.service - Import ZFS pools by cache file
     Loaded: loaded (/lib/systemd/system/zfs-import-cache.service; enabled; preset: enabled)
     Active: failed (Result: exit-code) since Thu 2025-10-16 13:55:39 CEST; 21h ago
       Docs: man:zpool(8)
    Process: 664 ExecStart=/sbin/zpool import -c /etc/zfs/zpool.cache -aN $ZPOOL_IMPORT_OPTS (code=exited, status=1/FAILURE)
   Main PID: 664 (code=exited, status=1/FAILURE)
        CPU: 39ms

Oct 16 13:55:40 pve5 zpool[664]: Last accessed by pbs (hostid=53b32436) at Tue Oct  7 11:13:09 2025
Oct 16 13:55:40 pve5 zpool[664]: The pool can be imported, use 'zpool import -f' to import the pool.
Oct 16 13:55:40 pve5 zpool[664]: cannot import 'Backups': pool was previously in use from another system.
Oct 16 13:55:40 pve5 zpool[664]: Last accessed by pbs (hostid=53b32436) at Tue Oct  7 11:13:09 2025
Oct 16 13:55:40 pve5 zpool[664]: The pool can be imported, use 'zpool import -f' to import the pool.
Oct 16 13:55:40 pve5 zpool[664]: cachefile import failed, retrying
Oct 16 13:55:37 pve5 systemd[1]: Starting zfs-import-cache.service - Import ZFS pools by cache file...
Oct 16 13:55:39 pve5 systemd[1]: zfs-import-cache.service: Main process exited, code=exited, status=1/FAILURE
Oct 16 13:55:39 pve5 systemd[1]: zfs-import-cache.service: Failed with result 'exit-code'.
Oct 16 13:55:39 pve5 systemd[1]: Failed to start zfs-import-cache.service - Import ZFS pools by cache file.
 
About the last error: you can try in the PVE systemctl disable --now zfs-import-scan,
as I mentioned above. PVE has no interest in the passed-through disks anyway.
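Roughly like this on the PVE; the unit failing in your log is zfs-import-cache, so I would disable both import units there, assuming the PVE itself has no ZFS pools of its own (yours is on LVM according to your lsblk):

Code:
# on the PVE host: stop it from trying to import the passed-through pool at boot
systemctl disable --now zfs-import-scan.service
systemctl disable --now zfs-import-cache.service
systemctl reset-failed             # clear the current "failed" state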

About recovery mode: if you don't succeed, maybe boot the VM from some "live CD" image.
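Booting from a live ISO would go roughly like this on the PVE, assuming the PBS VM were VMID 103 and the ISO had already been uploaded to the "local" storage (names are just examples):

Code:
# attach a rescue ISO as a CD-ROM and boot from it once
qm set 103 --ide2 local:iso/systemrescue.iso,media=cdrom
qm set 103 --boot order=ide2
qm start 103
# afterwards, point the boot order back at the VM's own disk, e.g.:
# qm set 103 --boot order=scsi0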
 
I have been away on a small trip.
I was finally able to boot into recovery mode on the PBS VM. All zpool commands show nothing.
I cannot copy from the console, so I posted pictures.
lsblk shows the following (the individual members of the pool are shown, but the pool is not available):

ksnip_20251021-231025.jpg
df -hT:

ksnip_20251021-231130.jpg
I have also tried:
Code:
zpool list (no pools available)

zpool status (no pools available)

zfs list (no datasets available)

zpool clear Backups (clear errors - hangs the server)

zpool import -f (hangs the server)

It seems very strange that 2 healthy HDDs crashed at the same time.
I have run into errors before; normally you are able to clear the errors and move on.
Sometimes ZFS errors happen at a power outage, for example...
I am outta ideas I'm afraid.
Maybe create a new PBS and a new array.
 

Attachments: ksnip_20251021-231130.jpg
Sometimes ZFS errors happen at a power outage, for example...
I am outta ideas I'm afraid.
Maybe create a new PBS and a new array.
A power outage while the pool is being written to is a really good candidate for needing a new zpool; it's even a highly mentioned ZFS "feature", and you have now seen for yourself how it works.
 
A power outage while the pool is being written to is a really good candidate for needing a new zpool; it's even a highly mentioned ZFS "feature"
I hope I'm misunderstanding.
Are you saying that a power outage can make a ZFS pool irreversibly damaged? :-(
Could you share a link to this "highly mentioned zfs feature"?
 
Tomorrow I will try to physically unplug 1 HDD at a time to see if that makes any difference. All attempts to import the pool result in a kernel error and a suspended pool. And if I don't import it, I cannot rebuild or list it.
If that does not do it, I will rebuild the array and make new backups.

I thought a mirror was safe(r) to use.
 
It was a customer's 350 TB production fileserver that crashed after a power outage ... it needed a new pool after the import proved impossible (x options tried), followed by a full data replay, which was a "real long run/fun" for everyone.
 
I tried unplugging each disk and tried to import/rebuild onto another disk of the same size. It did not work.
The pool shows as "degraded" (as it should, since it's missing a member) when I unplug 1 disk.
With only 1 disk member plugged in, one disk shows as "corrupted".
The other shows "online", but when I tried to import it and rebuild, I got kernel errors of different kinds.
So... is there no way to rebuild or recover the data?