Unbootable PVE after upgrade to 4.13.4-1-pve

DevFx11

New Member
Oct 25, 2017
Hi

I have a Proxmox system running the latest Proxmox with kernel 4.10.17-4-pve.
I upgraded the system, which also upgraded ZFS.
On the next reboot I was unable to boot into the system.
I got a lot of errors and tried a lot of fixes, but nothing worked.
After somehow getting past "Waiting for udev to initialize all hardware" and the endless lvm2 wait times (I don't even use LVM and have it disabled), I finally got a prompt. When I issue zpool import it hangs: I wait a long time, nothing happens, and I can't even Ctrl+C. blkid and fdisk -l hang as well.
Then I tried reverting to my old kernel and, after some minor fixes and a separate pool for data, I managed to make it work again. I am very glad it works, since it's a production system, small but still needed. We do not have a subscription, and I don't know whether a subscription would have helped or whether the result would have been the same. We are thinking about getting one in the future.
Could anyone please tell me what is wrong with the upgraded kernel? It seems to be an issue with the kernel, since everything works well with the old one.
One minor thing (or not so minor) is that zfs-zed -F is now using 100% of one of my CPU cores.
 
please try the following:
  • boot into 4.13.4-1-pve by pressing 'e' in the Grub menu, and adding 'debug break=mount' to the line starting with 'linux'
  • run modprobe zfs
  • zpool import
  • lsblk
and post the resulting output together with the content of /run/initramfs/initramfs.debug and the output of "dmesg"
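
The steps above can be collected into one log file with a small script run from the debug shell. This is only a sketch, assuming a POSIX sh and a writable /run: the log path in OUT and the helper names collect and collect_all are made up for illustration, while the commands and the /run/initramfs/initramfs.debug path are the ones listed above.

```shell
#!/bin/sh
# Sketch: run each diagnostic command and append its output to one log file.
# OUT is a hypothetical path; pick any location that is writable in the debug shell.
OUT="${OUT:-/run/zfs-debug.log}"

collect() {
    # Log the command line, then its stdout/stderr, then its exit status.
    echo "=== $* ===" >> "$OUT"
    "$@" >> "$OUT" 2>&1
    echo "--- exit status: $? ---" >> "$OUT"
}

collect_all() {
    # Same order as the steps requested in the post above.
    collect modprobe zfs
    collect zpool import
    collect lsblk
    collect cat /run/initramfs/initramfs.debug
    collect dmesg
}
```

Calling collect_all runs everything in order. Because each step appends to the log as it goes, a command that hangs (as zpool import did here) still leaves the output of the earlier steps in the file.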
 
Hi
Thanks for the reply.
I cannot test this right now because it is a production environment. Maybe at the weekend.
Meanwhile I am using the old kernel, which thank god works just fine.
But now email is broken as well:

Oct 25 10:44:30 pve1 postfix/qmgr[2691]: 37DCC1317A: from=<root@pve1.mydomain.local>, size=432, nrcpt=1 (queue active)
Oct 25 10:44:30 pve1 pmxcfs[2736]: [ipcs] crit: connection from bad user 65534! - rejected
Oct 25 10:44:30 pve1 pmxcfs[2736]: [libqb] error: Error in connection setup (2736-3615-28): Unknown error -1 (-1)
Oct 25 10:44:30 pve1 pmxcfs[2736]: [ipcs] crit: connection from bad user 65534! - rejected
Oct 25 10:44:30 pve1 pmxcfs[2736]: [libqb] error: Error in connection setup (2736-3615-28): Unknown error -1 (-1)
Oct 25 10:44:30 pve1 pmxcfs[2736]: [ipcs] crit: connection from bad user 65534! - rejected
Oct 25 10:44:30 pve1 pmxcfs[2736]: [libqb] error: Error in connection setup (2736-3615-28): Unknown error -1 (-1)
Oct 25 10:44:30 pve1 pmxcfs[2736]: [ipcs] crit: connection from bad user 65534! - rejected
Oct 25 10:44:30 pve1 pmxcfs[2736]: [libqb] error: Error in connection setup (2736-3615-28): Unknown error -1 (-1)
Oct 25 10:44:30 pve1 pmxcfs[2736]: [ipcs] crit: connection from bad user 65534! - rejected
Oct 25 10:44:30 pve1 pmxcfs[2736]: [libqb] error: Error in connection setup (2736-3615-28): Unknown error -1 (-1)
Oct 25 10:44:30 pve1 pvemailforward[3615]: mail forward failed: user 'root@pam' does not have a email address

root@pam does have an email address. Mail from the host to other addresses works fine, just not to root!
I am sending mail through a relayhost.
I also tried removing the email address from the root@pam user and re-adding it in the web GUI. Nothing changed.
I cannot send mail to root using echo "MYMAIL" | mail -s "MY SUBJECT" root
So no email sent to root will be delivered either: ZFS scrub reports and whatever other mail there might be.
Everything worked fine before the update and everything broke after it.
I upgraded several times before this and everything was working fine after each upgrade.
This is one problematic update, I must say.
 
Meanwhile I fixed my non-working email. It's a bug:

https://forum.proxmox.com/threads/p...root-pam-does-not-have-a-email-address.35961/
"this is a bug and will be fixed with the next update"
dcsapak, Jul 28, 2017

The fix is :
chmod g+s /usr/bin/pvemailforward
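
The chmod g+s above sets the setgid bit on the binary, so that it runs with the file's group. To confirm whether the bit is present before or after an upgrade, something like the following could be used. It is a minimal sketch: the function name check_setgid is made up for illustration, and it relies only on the standard `-g` file test.

```shell
#!/bin/sh
# Sketch: report whether a file has the setgid bit set.
# Returns 0 if set, 1 if not set, 2 if the file does not exist.
check_setgid() {
    f="$1"
    if [ ! -e "$f" ]; then
        echo "missing: $f"
        return 2
    fi
    if [ -g "$f" ]; then
        echo "setgid is set: $f"
    else
        echo "setgid is NOT set: $f"
        return 1
    fi
}
```

For example, `check_setgid /usr/bin/pvemailforward` can be run after each upgrade to see whether the fix needs to be reapplied.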

The forum says the bug will be fixed in the next upgrade, but apparently it's still there.
Just in case someone else bumps into this: the fix mentioned above works properly.
Now email is working again, phew, getting there one step at a time.

But now I'm starting to wonder: what if this happens again and again at each upgrade?
Am I supposed to spend hours or days on upgrades that should work out of the box?
 
I am also getting an error with zpool events, and zfs-zed is using 100% CPU

root@pve1:~# zpool events
TIME CLASS
internal error: Bad file descriptor
Aborted

It is as described here :
https://github.com/zfsonlinux/zfs/issues/4720

root@pve1:~# dmesg | grep -i zfs
[ 0.000000] Command line: BOOT_IMAGE=/ROOT/pve-1@/boot/vmlinuz-4.10.17-4-pve root=ZFS=rpool/ROOT/pve-1 ro root=ZFS=rpool/ROOT/pve-1 boot=zfs
[ 0.000000] Kernel command line: BOOT_IMAGE=/ROOT/pve-1@/boot/vmlinuz-4.10.17-4-pve root=ZFS=rpool/ROOT/pve-1 ro root=ZFS=rpool/ROOT/pve-1 boot=zfs
[ 9.312493] SPL: Loaded module v0.6.5.11-1
[ 9.329710] ZFS: Loaded module v0.6.5.11-1, ZFS pool version 5000, ZFS filesystem version 5
 
I am also getting an error with zpool events, and zfs-zed is using 100% CPU

this is because you are mixing old ZFS kernel module and new ZFS user space, which is not supported
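
A quick way to spot such a mismatch is to compare the version of the loaded kernel module with the version of the installed userspace package. The following is a sketch under stated assumptions: the package name zfsutils-linux and the major.minor comparison are assumptions made for illustration, not an official compatibility rule, and the function names are made up.

```shell
#!/bin/sh
# Sketch: compare the loaded ZFS kernel module version with the userspace package version.

# Reduce a version such as "0.6.5.11-1" to its "major.minor" series ("0.6").
zfs_series() {
    echo "$1" | cut -d. -f1,2
}

# Succeeds when both versions belong to the same major.minor series (a heuristic only).
same_zfs_series() {
    [ "$(zfs_series "$1")" = "$(zfs_series "$2")" ]
}

report_zfs_versions() {
    # /sys/module/zfs/version is only present while the module is loaded.
    mod="$(cat /sys/module/zfs/version 2>/dev/null)"
    # Assumption: the userspace tools come from the zfsutils-linux package.
    usr="$(dpkg-query -W -f='${Version}' zfsutils-linux 2>/dev/null)"
    echo "kernel module: ${mod:-not loaded}"
    echo "userspace:     ${usr:-not installed}"
    if [ -z "$mod" ] || [ -z "$usr" ]; then
        echo "cannot compare: one of the versions is unavailable"
        return 2
    fi
    if same_zfs_series "$mod" "$usr"; then
        echo "same series"
    else
        echo "MISMATCH: kernel module and userspace are from different series"
        return 1
    fi
}
```

On the system above, the dmesg output shows module v0.6.5.11-1, so a userspace package from the 0.7 series would be reported as a mismatch.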
 
The forum says the bug will be fixed in the next upgrade, but apparently it's still there.

it was fixed, but it seems the same problem has now been re-introduced via our build environment. we are working on a fix.
 
this is because you are mixing old ZFS kernel module and new ZFS user space, which is not supported
I understand, thanks.
The problem is that I have no choice: the new kernel has problems with the ZFS import and gave me an lvm2 error (even though I have LVM disabled), and I was unable to run zpool import, blkid, or fdisk -l. They would all hang there doing nothing.
 
yes, I know. you'll need to get to the bottom of that though (see my earlier post about next steps to debug), because 4.10 is EOL..
 
Hi,

I updated to Proxmox 5.1 and have the same issue: the system can't start. It only works if I use the old kernel, but then I can't start any VM. ZFS isn't mounted.

Best regards.
 
yes, I know. you'll need to get to the bottom of that though (see my earlier post about next steps to debug), because 4.10 is EOL..
Yes indeed, I would like to get to the bottom of this.
But as you can see, I am not the only one with the problem.
I will have to make time and arrangements to test this weekend.
From now on I will have to clone this system and upgrade the test environment first. But that needs extra hardware, which I neither have nor need :)
 
@fabian hi
yes, I know. you'll need to get to the bottom of that though (see my earlier post about next steps to debug), because 4.10 is EOL..
Yes, I know, but I have not had enough time to get to the bottom of this yet.

And my system suffers: zfs send does not work anymore, probably because of this mismatch, and zfs-zed takes up 100% CPU on one core ...

this update was more like a "backgrade" for me
 
Sorry for not replying for so long.
I will use this 4.10 kernel meanwhile.
I had loads of other issues and this had to be put on hold :)
But since it is pretty important, I will try to test with this.
Thanks a lot.
 
to help troubleshoot upgrade issues, we built a 4.10-based kernel including ZFS 0.7.3, available on pvetest. you need to download and install it manually, as no meta-package pulls it in automatically:
http://download.proxmox.com/debian/...pve-kernel-4.10.17-5-pve_4.10.17-25_amd64.deb

Code:
MD5:
1e511994999244e47b8e5a1fcce82cee  pve-kernel-4.10.17-5-pve_4.10.17-25_amd64.deb
SHA256:
5b903b467445bb9ae8fd941dfebf5ad37e8f979df08a9257dd087f4be718fb20  pve-kernel-4.10.17-5-pve_4.10.17-25_amd64.deb
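
Before installing, the download can be checked against the SHA256 value above. A minimal sketch, assuming sha256sum from coreutils is available; the function name verify_sha256 is made up for illustration:

```shell
#!/bin/sh
# Sketch: verify a downloaded file against an expected SHA256 checksum.
# Returns 0 on a match, 1 otherwise.
verify_sha256() {
    file="$1"
    expected="$2"
    # sha256sum prints "<hash>  <filename>"; keep only the hash.
    actual="$(sha256sum "$file" | cut -d' ' -f1)"
    if [ "$actual" = "$expected" ]; then
        echo "OK: $file"
    else
        echo "MISMATCH: $file"
        return 1
    fi
}
```

Usage would be `verify_sha256 pve-kernel-4.10.17-5-pve_4.10.17-25_amd64.deb <SHA256 from the post>`, installing the package only if it reports OK.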

a 4.13.4 kernel with ZFS 0.7.3 is available in pvetest as well (pulled in automatically on upgrading if you have pvetest enabled). ZFS userspace packages are also updated to 0.7.3 in pvetest, so make sure to upgrade those too when testing either of the updated kernels.


Hi

Thank you for the updated kernel.

I downloaded it and installed it using apt-get install ./pve-kernel-4.10.17-5-pve_4.10.17-25_amd64.deb
I have updated my system to this new 4.10.17-5 kernel and zfs-zed once again WORKS PROPERLY; the 100% CPU usage is gone.
Also, all the zfs commands work properly now, including zfs send / recv.

I am still in doubt about upgrading to the latest kernel, 4.13, because it might leave the system unbootable.
I guess there is only one way to find out :) to try it. That will have to wait.

Thanks again, everything works fine now.

Best Regards
 
