[SOLVED] Cannot boot into PVE... Failed to start zfs-import - Import ZFS pool local\x2dzfs

matrix1999

Member
Jan 10, 2023
I rebooted one of my cluster nodes this afternoon and during the reboot I saw the message:

[FAILED] Failed to start zfs-import - Import ZFS pool local\x2dzfs

I am currently running a pair of NVMe drives in a ZFS mirror. There is no prompt for me to continue. How do I go about fixing this, please?

(screenshot of the failed boot message attached)
 
please boot a live CD, chroot into your install, and post the full boot log
 
Thanks for your prompt reply. I am not a Proxmox expert and am not sure how to use chroot.

I just booted with a live CD and here is the list of all devices available for mounting. nvme0 and nvme1 are the two mirrored drives that I use for PVE.

I have read instructions online but am not sure if I need to mount one or the other, or both, since they are mirrored ZFS drives. Any guidance would be much appreciated.

(screenshot of the live CD device list attached)
 
something like this:

Code:
# create chroot dir
mkdir /target
# import rpool without mounting it, with all mountpoints prefixed with /target
zpool import -N -R /target /target
# mount root dataset to /target (add your dataset here - you can find it with "zfs list")
mount -t zfs rpool/... /target
# mount special filesystems
mount -t proc proc /target/proc
mount -t sysfs sys /target/sys
mount -o bind /dev /target/dev
mount -o bind /run /target/run
# chroot into chroot dir
chroot /target

and then you can run commands inside, like "journalctl -b-1" to obtain the last boot log, or "smartctl" to query disk status, and so on.
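When you are done inside the chroot, it is worth tearing everything down cleanly before rebooting. A rough sketch, assuming the mounts from the block above were all set up:

```shell
# leave the chroot first
exit
# unmount the special filesystems, then the root dataset
umount /target/run /target/dev /target/sys /target/proc
umount /target
# export the pool so the next boot can import it without complaints
zpool export rpool
```

These commands require root on the live system and an actually imported pool, so treat them as a procedure outline rather than something to paste blindly.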
 
When I ran this:
Code:
zpool import -N -R /target /target
it said no datasets available.

And if I run
Code:
zpool import -N -R /target rpool
it says
Code:
last access by Proxmox hostid=xxxxxxxx at yyyyy
The pool can be imported, use 'zpool import -f' to import the pool.
Is that what I am supposed to do?

I ran
Code:
zpool import
and this is what it shows. The rpool is my PVE ZFS environment.

(screenshot of the zpool import output attached)
 
sorry, yes, the double "/target" was a typo on my end. you need to use "-f" in this case, yes (and then once more when you first boot again after hopefully fixing the problem).
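Putting both corrections together (pool name instead of the doubled "/target", plus the force flag), the import would look something like:

```shell
# force-import rpool without mounting datasets, rooted at /target
zpool import -f -N -R /target rpool
```

Note that `-f` goes with the other options, before the pool name.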
 
Thank you so much for your help so far. My next challenge comes after I import rpool with zpool import -f -N -R /target rpool. Sorry for so many questions, but I want to be extra careful with chroot as I don't want to perform the wrong steps and cause disastrous consequences.

More Questions:
1/ Here is my output from zfs list:
(screenshot of the zfs list output attached)

2/ I tried mount -t zfs rpool/ROOT /target and it says,
(screenshot of the mount error attached)

3/ From the first screenshot above, it looks like I have mounted rpool/ROOT to /target/rpool/ROOT already. If that's the case, is that good enough?

4/ After mounting rpool to /target, do I still run the following:
Code:
mount -t proc proc /target/proc
mount -t sysfs sys /target/sys
mount -o bind /dev /target/dev
mount -o bind /run /target/run

5/ Lastly, for chroot, is it still chroot /target? Or is it chroot /target/rpool/ROOT?
 
the root dataset is actually rpool/ROOT/pve-1 , so "chroot /target" is correct if that is mounted there. I guess you could also try "zfs mount rpool/ROOT/pve-1" if mounting with "mount -t zfs" doesn't work.
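For reference, the full sequence with the corrections from this thread would look roughly like this (the dataset name rpool/ROOT/pve-1 is taken from the zfs list output above; adjust if yours differs):

```shell
# force-import the pool without mounting anything, rooted at /target
zpool import -f -N -R /target rpool
# confirm the root dataset name and its mountpoint
zfs list -o name,mountpoint
# mount the root dataset (it lands under /target because of -R)
zfs mount rpool/ROOT/pve-1
# switch into the installed system
chroot /target
```

This is a sketch of the procedure discussed here, not an official recovery recipe.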
 
I was able to run journalctl -b-1 but it is very long (as expected). How can I go about exporting it to a USB stick?

For export I can run journalctl -b-1 > bootlog.txt. However, how do I go about copying that to a USB stick since I am chroot'ed?

PS: none of the commands below worked for me. I just ran chroot /target immediately after zpool import -f -N -R /target rpool and zfs mount rpool/ROOT/pve-1. It seemed to work for me (at least I was able to run journalctl -b-1).

Code:
mount -t proc proc /target/proc
mount -t sysfs sys /target/sys
mount -o bind /dev /target/dev
mount -o bind /run /target/run
 
you can plug in a USB drive, and mount it inside the chroot dir before chrooting
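A minimal sketch of that idea. The device name /dev/sdb1 is an assumption here; verify yours with lsblk first:

```shell
# identify the USB stick (device name varies - check the lsblk output)
lsblk
# mount it at a point that will be visible inside the chroot
mkdir -p /target/mnt/usb
mount /dev/sdb1 /target/mnt/usb
# then, inside the chroot, write the log straight onto the stick
chroot /target
journalctl -b-1 > /mnt/usb/bootlog.txt
```

Note the path changes inside the chroot: /target/mnt/usb on the live system is /mnt/usb from the chroot's point of view.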
 
it looks to me like it's actually the networking initialization that doesn't finish. the ZFS pool warning seems to be real, but non-fatal.

do you have access to a (real or virtual) keyboard and screen during booting?
 
No, the screen just goes black with no login or prompt. The only thing visible was the PVE version selection prompt. However, you have raised an interesting point. The day before my server had this issue, I was changing the server's network MTU (from 1500 to 9000). That must have broken something in the boot process. So now that I can chroot, I'll see if I can revert the MTU to its original value and see if that fixes the issue.

Update: I edited /etc/network/interfaces to remove all MTU-related settings, rebooted the system, and no luck. I also rebooted using the USB and tried Rescue Boot, still no luck. It says it cannot find rpool. So it looks like the underlying issue is still that rpool is not being recognized.

Update 2: I used an Ubuntu live USB and was able to boot and have ethernet working, so this doesn't appear to be a hardware-related issue.

What else can I do to identify the core issue?

(screenshot attached)
 
no, the rescue boot is just broken for ZFS.

so you only get the Grub menu for selecting the kernel, and nothing after? can you try adding "nomodeset" to the kernel command line (press 'e' on the entry you want to attempt to boot, add it to the line starting with "linux", and follow the instructions at the bottom to attempt booting)
 
Here is my original kernel command line:
Code:
initrd=\EFI\proxmox\6.8.12-11-pve\initrd.img-6.8.12-11-pve root=ZFS=rpool/ROOT/pve-1 boot=zfs quiet intel_iommu=on iommu=pt nvme_core.default_ps_max_latency_us=0 pcie_acs_override=downstream,multifunction initcall_blacklist=sysfb_init video=simplefb:off video=vesafb:off video=efifb:off video=vesa:off disable_vga=1 vfio_iommu_type1.allow_unsafe_interrupts=1 kvm.ignore_msrs=1

I modified it by removing all the video passthrough, and added nomodeset by following your instructions:
Code:
initrd=\EFI\proxmox\6.8.12-11-pve\initrd.img-6.8.12-11-pve root=ZFS=rpool/ROOT/pve-1 boot=zfs quiet intel_iommu=on iommu=pt nvme_core.default_ps_max_latency_us=0 nomodeset

And here is what I got:
(screenshot of the boot output attached)

I then followed the on-screen prompt by running zpool import -f rpool and then exit as suggested. That fixed my system and it booted up normally. I was then able to log in via the web interface, ran updates, and everything seems to be working. Thank you.

My last question is: do I need to run this every time I boot this node? Also, I assume I need to rerun proxmox-boot-tool refresh, correct? (No need to run update-grub since the system is booting from ZFS.)
 
no, that was just because you booted a live CD in between. proxmox-boot-tool and other kernel/initrd/bootloader-related invocations should happen automatically on updates; there is normally no need to run them manually.
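If you want to double-check the bootloader state once the node is back up, proxmox-boot-tool can report it directly (run on the PVE host as root):

```shell
# list the ESPs proxmox-boot-tool manages and the kernels synced to them
proxmox-boot-tool status
# a manual refresh is only needed after manual kernel/ESP changes:
# proxmox-boot-tool refresh
```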
 