[SOLVED] zfs: cannot import rpool after reboot

To help troubleshoot upgrade issues, we built a 4.10-based kernel including ZFS 0.7.3, available on pvetest. You need to download and install it manually, as no meta-package pulls it in automatically:
http://download.proxmox.com/debian/...pve-kernel-4.10.17-5-pve_4.10.17-25_amd64.deb

Code:
MD5:
1e511994999244e47b8e5a1fcce82cee  pve-kernel-4.10.17-5-pve_4.10.17-25_amd64.deb
SHA256:
5b903b467445bb9ae8fd941dfebf5ad37e8f979df08a9257dd087f4be718fb20  pve-kernel-4.10.17-5-pve_4.10.17-25_amd64.deb

A 4.13.4 kernel with ZFS 0.7.3 is available in pvetest as well (pulled in automatically on upgrade if you have pvetest enabled). The ZFS userspace packages in pvetest are also updated to 0.7.3, so make sure to upgrade those too when testing either of the updated kernels.
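
For anyone unsure about the manual steps, here is a minimal sketch of fetching, verifying, and installing the package. The download URL above is abbreviated, so substitute the full link from the post; the placeholder <full-path> below is not a real path.
Code:
# download the kernel package (use the full URL from the link above; <full-path> is a placeholder)
wget "http://download.proxmox.com/debian/<full-path>/pve-kernel-4.10.17-5-pve_4.10.17-25_amd64.deb"

# verify the SHA256 checksum posted above before installing
echo "5b903b467445bb9ae8fd941dfebf5ad37e8f979df08a9257dd087f4be718fb20  pve-kernel-4.10.17-5-pve_4.10.17-25_amd64.deb" | sha256sum -c -

# install the local .deb together with its dependencies
apt-get install ./pve-kernel-4.10.17-5-pve_4.10.17-25_amd64.deb

After installation, reboot and pick the 4.10.17-5 entry in the GRUB menu if it is not the default.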
 
Sorry to hook onto this thread, but I believe I have the very same issue, and I don't see any response here from the OP.
[Attachment: IMG_20171125_144946.jpg]


All of 4.13.4-1, 4.13.8-1, and 4.13.8-2 cause the same issue as in the screenshot.
pve-kernel-4.10.17-5-pve works flawlessly.
I'm running on an Intel(R) Xeon(R) CPU E5-2660 0 @ 2.20GHz.
I have seen in other threads that there seem to be some issues between this kernel and certain CPUs.

Since the OP went quiet, maybe I can help with further troubleshooting.

Andreas
 
Many others are affected by a kernel bug, and an updated 4.10 kernel helped with those.
Thanks, and it does work for me too, but I wonder whether there is work underway to get the rest running on 4.13 as well.
I can't find any thread about a fix for kernel 4.13.
 


Hi

Thank you for the updated kernel.

Downloaded and installed using apt-get install ./pve-kernel-4.10.17-5-pve_4.10.17-25_amd64.deb
I have updated my system to this new 4.10.17-5 kernel and zfs-zed once again WORKS PROPERLY; the 100% CPU usage is gone.
All the ZFS commands work properly now as well, including zfs send / recv.

I am still hesitant to upgrade to the latest 4.13 kernel, since it might leave the system unbootable.
I guess there is only one way to find out :) and that is to try it. That will have to wait.

Thanks again, everything works fine now.

Best Regards
 


Hi everybody,

I have the same problem as @Madhatter on a machine that I'm trying to integrate into my cluster!
I installed Proxmox 5.1-3 from CD-ROM and selected the ZFS RAID-1 storage type for rpool.

Here is the specification of my server:

Lenovo D30
  • 32 GB RAM
  • 24 x Intel(R) Xeon(R) CPU E5-2620 v2 @ 2.10GHz (2 sockets)
  • Disks used for ZFS rpool: 2 x SSD KINGSTON HyperX 220 GB
  • Disks used for ZFS storage:
    • 1 WD20EARS Green - 2 TB SATA II, 64 MB cache - 5400 RPM
    • 1 WDC WD20EFRX-68A Red - 2 TB SATA 6 Gb/s, 64 MB cache - 5400 RPM
  • ZFS network port: 1Gbit
[First problem]

My server always drops me to the BusyBox / initramfs prompt!
Every time, I have to re-enter these commands:
Code:
+++++ Into busybox / initramfs +++++

/ # zpool import -N -f rpool
/ # exit

...and the server then continues with the error below:

[Second problem]

[Attachment: proxmox-boot-error.jpg]


Can someone help me? My server doesn't boot. :(

Thanks
 
Try installing with the 5.0 ISO (which contains a 4.10-based kernel) and then upgrade to the latest 4.13.13 kernel; there have been problems with some SCSI controllers in earlier 4.13 kernels which showed similar symptoms.
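
A rough sketch of that upgrade path from the command line, assuming the relevant repository (pve-no-subscription or pvetest) is already configured; the exact 4.13.13 package name below is only an example, so check what apt actually offers first.
Code:
# refresh the package lists after installing from the 5.0 ISO
apt-get update

# list the 4.13 kernel builds currently available in the repository
apt-cache search pve-kernel-4.13

# install the latest 4.13.13 build shown by the search (example name), then reboot into it
apt-get install pve-kernel-4.13.13-1-pve
reboot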
 

Is there any downside to staying with the pve-kernel-4.10.17-5-pve kernel while 4.13 gets more stable?
I'd happily take stability over features / performance / security.
 
This helped me as well. It seems my controller takes a few more seconds to initialize all 10 disks; 5 seconds for both options was enough in my case.
Thank you very much!
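
For readers wondering which two options are meant here: most likely the initramfs sleep settings that the zfs-initramfs package reads from /etc/default/zfs (an assumption based on the symptom of a controller that needs extra time to present all disks). A minimal sketch:
Code:
# /etc/default/zfs -- give a slow controller time to present all disks at boot
ZFS_INITRD_PRE_MOUNTROOT_SLEEP='5'   # wait before the root pool import is attempted
ZFS_INITRD_POST_MODPROBE_SLEEP='5'   # wait again right after the zfs module is loaded

# rebuild the initramfs afterwards so the new values end up in the boot image
update-initramfs -u -k all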
 
I have the same issue, with the added difficulty of a read-only filesystem...
I tried to fsck the drive but can't, since it is in use, so I am stuck in a loop.
I tried this suggestion:
"'/dev/sda3' is a physical volume of a volume group. Check with 'vgdisplay'; a VG should appear (probably pve). You can activate that VG with 'vgchange -a y <VGname>'; then those devices get mapped. fsck can then run on these mapped devices (e.g. /dev/mapper/pve)."

but it failed as well.
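
Spelled out as commands, the suggested sequence looks roughly like this. Note that fsck can only check the root LV while it is not mounted read-write, so in practice this is run from a rescue/live environment or the initramfs shell rather than the running system (names assume the default pve/pve-root layout):
Code:
# list volume groups; the Proxmox default VG is "pve"
vgdisplay

# activate the VG so its logical volumes appear under /dev/mapper
vgchange -a y pve

# check and repair the root LV (it must not be mounted read-write at this point)
fsck -f /dev/mapper/pve-root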

My device structure looks like this:
Code:
control pve-data_tmeta pve-swap
pve-data pve-data-tpool pve-vm--200--disk--2
pve-data_tdata pve-root pve-vm--201--disk--1

and of course, anything I try is bound to fail since the filesystem is read-only:
Code:
update-grub
/usr/sbin/grub-mkconfig: 253: /usr/sbin/grub-mkconfig: cannot create /boot/grub/grub.cfg.new: Read-only file system

fdisk -l returns:
Code:
fdisk -l
Disk /dev/sda: 232.9 GiB, 250059350016 bytes, 488397168 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: gpt
Disk identifier: D9EA50B0-C918-408F-A5BE-7C2842D47220

Device Start End Sectors Size Type
/dev/sda1 2048 4095 2048 1M BIOS boot
/dev/sda2 4096 528383 524288 256M EFI System
/dev/sda3 528384 488397134 487868751 232.6G Linux LVM


Disk /dev/mapper/pve-swap: 16 GiB, 17179869184 bytes, 33554432 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes


Disk /dev/mapper/pve-root: 32 GiB, 34359738368 bytes, 67108864 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes


Disk /dev/mapper/pve-vm--200--disk--2: 128 GiB, 137438953472 bytes, 268435456 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 65536 bytes / 65536 bytes
Disklabel type: dos
Disk identifier: 0xc2b3c2b3

Device Boot Start End Sectors Size Id Type
/dev/mapper/pve-vm--200--disk--2-part1 * 63 1124414 1124352 549M 7 HPFS/NT
/dev/mapper/pve-vm--200--disk--2-part2 1124550 201322693 200198144 95.5G 7 HPFS/NT

Partition 1 does not start on physical sector boundary.
Partition 2 does not start on physical sector boundary.


Disk /dev/mapper/pve-vm--201--disk--1: 48 GiB, 51539607552 bytes, 100663296 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 65536 bytes / 65536 bytes
Disklabel type: dos
Disk identifier: 0xb1d19382

Device Boot Start End Sectors Size Id Type
/dev/mapper/pve-vm--201--disk--1-part1 * 2048 92274687 92272640 44G 83 Linux
/dev/mapper/pve-vm--201--disk--1-part2 92276734 100661247 8384514 4G 5 Extende
/dev/mapper/pve-vm--201--disk--1-part5 92276736 100661247 8384512 4G 82 Linux s

Partition 2 does not start on physical sector boundary.
 
My syslog shows the following (many errors); the system was working properly before the last upgrade:
Code:
Mar 28 06:25:01 pve2 liblogging-stdlog: [origin software="rsyslogd" swVersion="8.24.0" x-pid="21191" x-info="http://www.rsyslog.com"] rsyslogd was HUPed
Mar 28 06:26:00 pve2 systemd[1]: Starting Proxmox VE replication runner...
Mar 28 06:26:00 pve2 systemd[1]: Started Proxmox VE replication runner.
Mar 28 06:26:30 pve2 systemd[1]: dev-zvol-rpool-swap.device: Job dev-zvol-rpool-swap.device/start timed out.
Mar 28 06:26:30 pve2 systemd[1]: Timed out waiting for device dev-zvol-rpool-swap.device.
Mar 28 06:26:30 pve2 systemd[1]: Dependency failed for /dev/zvol/rpool/swap.
Mar 28 06:26:30 pve2 systemd[1]: dev-zvol-rpool-swap.swap: Job dev-zvol-rpool-swap.swap/start failed with result 'dependency'.
Mar 28 06:26:30 pve2 systemd[1]: dev-zvol-rpool-swap.device: Job dev-zvol-rpool-swap.device/start failed with result 'timeout'.
Mar 28 06:27:00 pve2 systemd[1]: Starting Proxmox VE replication runner...
Mar 28 06:27:01 pve2 systemd[1]: Started Proxmox VE replication runner.
Mar 28 06:28:00 pve2 systemd[1]: Starting Proxmox VE replication runner...
Mar 28 06:28:00 pve2 systemd[1]: Started Proxmox VE replication runner.
Mar 28 06:28:30 pve2 systemd[1]: dev-zvol-rpool-swap.device: Job dev-zvol-rpool-swap.device/start timed out.
Mar 28 06:28:30 pve2 systemd[1]: Timed out waiting for device dev-zvol-rpool-swap.device.
Mar 28 06:28:30 pve2 systemd[1]: Dependency failed for /dev/zvol/rpool/swap.
Mar 28 06:28:30 pve2 systemd[1]: dev-zvol-rpool-swap.swap: Job dev-zvol-rpool-swap.swap/start failed with result 'dependency'.
Mar 28 06:28:30 pve2 systemd[1]: dev-zvol-rpool-swap.device: Job dev-zvol-rpool-swap.device/start failed with result 'timeout'.
[... the same replication-runner and dev-zvol-rpool-swap timeout messages repeat every two minutes ...]
Mar 28 06:34:21 pve2 rrdcached[1367]: flushing old values
Mar 28 06:34:21 pve2 rrdcached[1367]: rotating journals
Mar 28 06:34:21 pve2 rrdcached[1367]: started new journal /var/lib/rrdcached/journal/rrd.journal.1553751261.377963
Mar 28 06:34:21 pve2 rrdcached[1367]: removing old journal /var/lib/rrdcached/journal/rrd.journal.1553744061.377967
[... the same pattern of replication-runner and dev-zvol-rpool-swap timeout messages continues until 06:38:30 ...]
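
Those repeated timeouts mean systemd is waiting for the swap zvol /dev/zvol/rpool/swap and the device never shows up (usually because the pool is imported too late or the zvol no longer exists). If swap on the zvol is not needed, the usual cleanup is something like the sketch below; this is not from the thread, so adapt it to your setup before applying:
Code:
# check whether the swap zvol still exists once the pool is imported
zfs list -t volume rpool/swap

# if it is not wanted, disable it and comment out the corresponding fstab entry
swapoff /dev/zvol/rpool/swap 2>/dev/null
sed -i 's|^/dev/zvol/rpool/swap|#&|' /etc/fstab

# reload systemd so the stale .swap/.device units are regenerated from fstab
systemctl daemon-reload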
 
Mike, have you tried running update-initramfs -u?

While it didn't work for me, considering you are able to boot completely after manually importing the pool, this may be worth a try for you, as well as the rootdelay option in GRUB (hit "e" when the GRUB menu shows, manually append "rootdelay=30" to the boot entry, then press F10 to boot).
There is no "root delay" option is Grub. Can I randomly add "root delay = 30" line anywhere I want in grub?
 
