Proxmox 3.4 upgrade failure, stuck in busybox due to ZFS

chriswayg

Renowned Member
Oct 17, 2015
Hi,

After upgrading Proxmox 3.4 and rebooting, the system got stuck in busybox, failing to boot with:
Code:
Error: Failed to mount root filesystem 'rpool/ROOT/pve-1/'

When checking with 'mount', rpool actually shows up as mounted on /root, but the boot still fails anyway.
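
For reference, from the busybox/initramfs prompt it is sometimes possible to import the pool and mount the root dataset by hand, then continue the boot. This is only a rough sketch (it assumes the default pool name rpool and the root dataset rpool/ROOT/pve-1 created by the installer):
Code:
zpool import -N rpool        # import the pool without mounting any datasets
zfs mount rpool/ROOT/pve-1   # mount the root dataset by hand
exit                         # leave busybox and let the boot continue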

Proxmox 3.4 was installed with ZFS RAID-Z1 on two hard disks using the default ISO installer. It is running the Linux 3.10.0-11-pve kernel, which worked fine up to and including the Oct. 7 updates.

To recover, I had to boot into a previous kernel, Linux 2.6.32.40-pve (even Linux 2.6.32.42-pve failed to boot!). Then I rolled back the following ZFS-related packages from 0.6.5-1~wheezy (zfsutils, zfs-initramfs, etc.) to 0.6.4-4~wheezy:
Code:
apt-get install \
libuutil1:amd64=0.6.4-4~wheezy \
libnvpair1:amd64=0.6.4-4~wheezy \
libzpool2:amd64=0.6.4-4~wheezy \
libzfs2:amd64=0.6.4-4~wheezy \
spl:amd64=0.6.4-4~wheezy \
zfsutils:amd64=0.6.4-4~wheezy \
zfs-initramfs:amd64=0.6.4-4~wheezy

apt-mark hold libnvpair1 libuutil1 libzfs2 libzpool2 spl zfs-initramfs zfsutils

Possibly not all of these packages needed to be rolled back, but I suspect zfs-initramfs or zfsutils in particular, as similar problems during previous upgrades have been reported on this forum. In any case, it seems better to keep all ZFS-related packages at the same 0.6.4-4 version.
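
To verify that the downgrade actually took effect and that the holds are in place, the standard dpkg/apt commands should be enough (just my own sanity check, nothing Proxmox-specific):
Code:
dpkg -l | grep -E 'zfs|spl|nvpair|uutil|zpool'   # every package should show 0.6.4-4~wheezy
apt-mark showhold                                # lists the packages currently held back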

Has anyone else encountered this during a recent upgrade with ZFS and Linux 2.6.32.42-pve or Linux 3.10.0-11-pve? What was your solution?
 
If you run the 3.10 kernel branch, you need to "manually" upgrade this kernel. Only the default kernel branch updates automatically.

Latest kernel from the 3.10 kernel branch:

> apt-get install pve-kernel-3.10.0-13-pve
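
To see which kernels are currently available in that branch, something like this should work:

> apt-cache search pve-kernel-3.10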
 
Hi Tom, thanks for the hint. I did not expect that I would have to manually track upgrades for the 3.10 kernel.

I upgraded to pve-kernel-3.10.0-13-pve, but the issue remains:

The following kernels will not boot with ZFS 0.6.5-1 (but they will boot with ZFS 0.6.4-4):
  • pve-kernel-2.6.32.42-pve
  • pve-kernel-3.10.0-13-pve
  • pve-kernel-3.10.0-11-pve

To recover, I again had to boot into the following kernel and downgrade to ZFS 0.6.4-4 as explained above:
  • pve-kernel-2.6.32.40-pve
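
One thing I still want to rule out (just an assumption on my part, not something I have verified yet) is whether the initramfs for each kernel was actually regenerated after the ZFS package changes, e.g. with:
Code:
update-initramfs -u -k all   # rebuild the initramfs for every installed kernel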

Any idea what else could be causing this problem?



 
The following kernels will not boot with ZFS 0.6.5-1 (but they will boot with ZFS 0.6.4-4):
  • pve-kernel-2.6.32.42-pve
  • pve-kernel-3.10.0-13-pve
Have you been able to reproduce this problem? Is this issue going to be fixed in an upcoming update?

Regards,
Christian


 
Works here, so the question is what is different on your side.
 
Hi Tom,
Well, if this issue does not show up on anyone else's installation, it will be difficult to trace here, and I will just keep the ZFS packages on hold. A fresh setup of Proxmox 3.4 might fix it, but I've already started the process of moving the server to a new install of Proxmox 4.0. Hopefully the issue will not recur there.

What is different on my side?

It's a pretty default ISO install of Proxmox 3.4 with ZFS RAID-Z1 on two hard disks:

  • Some additional Debian Wheezy utilities
  • The main change is running the pve-kernel-3.10 branch
  • Plus I am running Docker 1.8 from the Docker repository. Could that cause problems?
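
If it helps with the comparison, this is roughly what I would collect and post from my side (standard commands, nothing exotic):
Code:
pveversion -v   # full list of installed Proxmox package versions
zpool status    # pool layout and health
uname -r        # currently running kernel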

Regards,
Christian
 