zpool failed to import on reboot

dswartz

I have a SATA raid1 pool and an NVMe raid1 pool. After a reboot, the NVMe pool was not present. I imported it manually and all was well. Looking at /var/log/syslog, I saw this:

May 1 17:23:20 pve zpool[1036]: internal error: Value too large for defined data type
May 1 17:23:20 pve systemd[1]: zfs-import-cache.service: main process exited, code=killed, status=6/ABRT
May 1 17:23:20 pve systemd[1]: Failed to start Import ZFS pools by cache file.
May 1 17:23:20 pve systemd[1]: Unit zfs-import-cache.service entered failed state.

Let me know what information I can provide (I am able to reboot the host if necessary...)
 
please always include the output of "pveversion -v" when reporting potential problems. how did you create this pool? please also post the output of "zpool get all"
 
Apologies for forgetting that. So...

proxmox-ve: 4.4-86 (running kernel: 4.4.49-1-pve)
pve-manager: 4.4-13 (running version: 4.4-13/7ea56165)
pve-kernel-4.4.35-1-pve: 4.4.35-77
pve-kernel-4.4.44-1-pve: 4.4.44-84
pve-kernel-4.4.49-1-pve: 4.4.49-86
lvm2: 2.02.116-pve3
corosync-pve: 2.4.2-2~pve4+1
libqb0: 1.0.1-1
pve-cluster: 4.0-49
qemu-server: 4.0-110
pve-firmware: 1.1-11
libpve-common-perl: 4.0-94
libpve-access-control: 4.0-23
libpve-storage-perl: 4.0-76
pve-libspice-server1: 0.12.8-2
vncterm: 1.3-2
pve-docs: 4.4-4
pve-qemu-kvm: 2.7.1-4
pve-container: 1.0-97
pve-firewall: 2.0-33
pve-ha-manager: 1.0-40
ksm-control-daemon: 1.2-1
glusterfs-client: 3.5.2-2+deb8u3
lxc-pve: 2.0.7-4
lxcfs: 2.0.6-pve1
criu: 1.6.0-1
novnc-pve: 0.5-9
smartmontools: 6.5+svn4324-1~pve80
zfsutils: 0.6.5.9-pve15~bpo80

I created the nvme pool manually, IIRC.
 
I used /dev/nvme0n1 and /dev/nvme1n1. nvme devices are treated specially, so this was my only option...
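For context, a whole-disk NVMe mirror like the one described above would have been created roughly like this (the pool name and device paths are taken from this thread; the exact command is a reconstruction, not a record of what was actually run):

```shell
# create a mirrored pool named "nvme" from two whole NVMe disks
zpool create nvme mirror /dev/nvme0n1 /dev/nvme1n1

# confirm the resulting layout
zpool status nvme
```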
 
what does the following say:
Code:
journalctl -u "zfs*" --since "2017-05-01" --until "2017-05-02"
 
-- Logs begin at Mon 2017-05-01 17:23:05 EDT, end at Wed 2017-05-03 09:12:53 EDT
May 01 17:23:06 pve systemd[1]: Starting Import ZFS pools by cache file...
May 01 17:23:15 pve zpool[1036]: internal error: Value too large for defined data type
May 01 17:23:15 pve systemd[1]: zfs-import-cache.service: main process exited, code=killed, status=6/ABRT
May 01 17:23:15 pve systemd[1]: Failed to start Import ZFS pools by cache file.
May 01 17:23:15 pve systemd[1]: Unit zfs-import-cache.service entered failed state.
May 01 17:23:15 pve systemd[1]: Starting Mount ZFS filesystems...
May 01 17:23:15 pve systemd[1]: Started Mount ZFS filesystems.
May 01 17:23:20 pve systemd[1]: Starting ZFS file system shares...
May 01 17:23:21 pve systemd[1]: Started ZFS file system shares.
 
Code:
zpool set cachefile=none rpool
zpool set cachefile=/etc/zfs/zpool.cache rpool
update-initramfs -u
reboot
 
It isn't rpool. Also, none of this fails until the system is well up (well past initramfs being in play...)
 
  pool: nvme
 state: ONLINE
  scan: scrub repaired 0 in 0h58m with 0 errors on Sun Apr 30 01:23:10 2017
config:

        NAME         STATE     READ WRITE CKSUM
        nvme         ONLINE       0     0     0
          mirror-0   ONLINE       0     0     0
            nvme0n1  ONLINE       0     0     0
            nvme1n1  ONLINE       0     0     0

how so?
 
root@pve:~# fdisk -l | grep nvme
Partition 3 does not start on physical sector boundary.

Disk /dev/nvme1n1: 953.9 GiB, 1024209543168 bytes, 2000409264 sectors
Disk /dev/nvme0n1: 953.9 GiB, 1024209543168 bytes, 2000409264 sectors
Partition 3 does not start on physical sector boundary.

I have never had an issue using whole-disk for zfs pool members. Why would nvme be special? And why does it work if I import the pool manually?
 
Code:
Disk /dev/nvme0n1: 477 GiB, 512110190592 bytes, 1000215216 sectors
/dev/nvme0n1p1         34       2047       2014 1007K BIOS boot
/dev/nvme0n1p2       2048 1000198797 1000196750  477G Solaris /usr & Apple ZFS
/dev/nvme0n1p9 1000198798 1000215182      16385    8M Solaris reserved 1
Disk /dev/nvme1n1: 477 GiB, 512110190592 bytes, 1000215216 sectors
/dev/nvme1n1p1         34       2047       2014 1007K BIOS boot
/dev/nvme1n1p2       2048 1000198797 1000196750  477G Solaris /usr & Apple ZFS
/dev/nvme1n1p9 1000198798 1000215182      16385    8M Solaris reserved 1

For ZFS I used nvme0n1p2 and nvme1n1p2.
 
Have you tried clearing the zpool cachefile for the pool named nvme?

zpool set cachefile=none nvme
zpool set cachefile=/etc/zfs/zpool.cache nvme
 
I can try both of those, but I'd like to know why they are causing an 'internal error' :)
 
@ivensiya : please don't confuse this issue further ;) you have an rpool, which needs partitions, otherwise it is not bootable. @dswartz has a non-rpool pool, so the full disk is usable there (and no, this should not be a problem, neither for regular disks nor for NVMe devices).

@dswartz: maybe your cache file did get corrupted? you can delete it and set cachefile=none on all your pools to not use it at all, or you can delete it and export/import your pools to regenerate it (check the cachefile setting).
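If the cache file is suspect, the two options above could look something like this (the pool name "nvme" is taken from this thread; treat this as a sketch to adapt, not a drop-in script):

```shell
# option 1: stop using the cache file for this pool entirely
zpool set cachefile=none nvme
rm -f /etc/zfs/zpool.cache

# option 2: delete the cache file and regenerate it from scratch
rm -f /etc/zfs/zpool.cache
zpool export nvme
zpool import nvme            # importing writes a fresh entry into the cache file
zpool get cachefile nvme     # verify the resulting setting
```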

there are two regular ways for zpools to get imported on boot, which are mutually exclusive:
  • zfs-import-cache systemd unit, which will import pools using a cache file (so only those in that cache file get imported, and no scanning of block devices is done)
  • zfs-import-scan systemd unit, which will import all found pools on block devices (a bit more "expensive", and is only triggered if NO cache file exists)
in either case, PVE will import any configured pools on first use anyway if they are not yet imported - so if you only use your pools for guest storage, you actually don't need either of the systemd units. if you use them for other things as well, you can choose which of the units to use (or even write your own, which e.g. imports a specific pool with specific import options at a specific point in the boot process ;))
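To see which of the two units is actually in play on a given host, something like the following should work (standard systemd unit names and the default cache file path, shown as a sketch):

```shell
# status of both import units - at most one should have performed the import
systemctl status zfs-import-cache.service zfs-import-scan.service

# the scan unit is only triggered when no cache file exists
ls -l /etc/zfs/zpool.cache
```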
 
I didn't want to disable the cache import unit, so I deleted the cache file, exported, imported and rebooted (to test), and nvme was imported just fine. So I guess it got corrupted. Thanks for the fix :)
 
