Issue after upgrade to 7.2.3

The issue is tracked here:
https://bugzilla.proxmox.com/show_bug.cgi?id=4033

We have a new pve-kernel-5.15.35-1-pve package on the pvetest repository, which contains a commit that should fix the issue with QNAP's NFS server implementation.

It would be great if you could try this kernel (and remove the workaround of pinning the NFS version beforehand).

Thanks!
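For anyone who wants to try this, a minimal sketch of the steps (the repository line assumes PVE 7.x on Debian Bullseye; the storage name is a placeholder):

Code:
# enable the pvetest repository
echo "deb http://download.proxmox.com/debian/pve bullseye pvetest" > /etc/apt/sources.list.d/pvetest.list
apt update
apt install pve-kernel-5.15.35-1-pve
# undo the NFS-version workaround: remove the "options vers=..." line of the
# affected NFS storage in /etc/pve/storage.cfg
# (pvesm set <storage-id> --delete options should do the same)
reboot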
 
For the record, the package version of the pve-kernel-5.15.35-1-pve package needs to be 5.15.35-2 for the fix.
As can be read here: https://forum.proxmox.com/threads/proxmox-ve-7-2-released.108970/post-468736
The kernel pve-kernel-5.15.30-2-pve does not fix the issue. As a remark: I have a QNAP with an iSCSI and an NFS mount, plus other storages as well, but apparently if the QNAP mount fails, the GUI freezes while everything under the hood keeps working and the cluster can still be managed via the CLI. For the record, it is mounted with NFS version 4.2.
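A quick way to verify the installed package version and the running kernel (a small sketch; output is only indicative):

Code:
# the package version must be 5.15.35-2 (or newer) for the fix
dpkg-query -W pve-kernel-5.15.35-1-pve
# running kernel ABI; should report 5.15.35-1-pve after rebooting into it
uname -r
pveversion -v | grep -i kernel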
 
That's the wrong one though: .30-2 vs. .35-1. Only the latter includes the proposed fix, and it is currently only available in pvetest.
Sorry, I missed that info ;-)

New information (each version tested by mounting manually; see the sketch below):
Only problems with NFS mounts to the QNAP.
NFSv4.2 does not work
NFSv4.1 does not work
NFSv4.0 works
NFSv3 does not work
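The per-version results above can be reproduced by mounting by hand; a sketch (server address and export path are placeholders):

Code:
mkdir -p /mnt/nfstest
# on the affected kernel only the vers=4.0 mount succeeds
mount -t nfs -o vers=4.2 192.0.2.50:/export /mnt/nfstest
mount -t nfs -o vers=4.1 192.0.2.50:/export /mnt/nfstest
mount -t nfs -o vers=4.0 192.0.2.50:/export /mnt/nfstest
mount -t nfs -o vers=3   192.0.2.50:/export /mnt/nfstest
umount /mnt/nfstest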
 
A question: is it really optimal for mounts to be blocking?
Would it not be better if mounts were non-blocking and used a timeout instead, and, in case a mount hits that timeout, refuse to start VMs/CTs with disks on those mounts?
 
Confirmed: I have the same problem after the upgrade, and I'm also using a QNAP NFS share.
 
Would it not be better if mounts were non-blocking and used a timeout instead, and, in case a mount hits that timeout, refuse to start VMs/CTs with disks on those mounts?
There is a timeout, but as mentioned elsewhere, those processes end up in uninterruptible sleep, and no timeout will help you there; they simply cannot be interrupted. There are ideas on how to improve on that, but it isn't trivial and would change the existing architecture quite a bit.
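As an illustration of why no timeout can help, a sketch of how to spot such hung processes (purely diagnostic):

Code:
# tasks in uninterruptible sleep show state 'D' and cannot be killed or timed out
ps -eo pid,stat,wchan:32,cmd | awk 'NR==1 || $2 ~ /^D/'
# optionally dump all blocked tasks (with kernel stack traces) to the kernel log
echo w > /proc/sysrq-trigger
dmesg | tail -n 50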
 
We got enough reports of QNAP NFS shares not working, thanks for that. What we would now appreciate is someone testing the new kernel from pvetest, which contains a proposed fix for this, with the NFS version pinning disabled, to see whether it works. Feedback on that would allow us to provide the solution to more people faster, at least if it indeed fixes it.
 
Confirmed: I had the same problems with a QNAP TS-1677XU-RP after an update from Proxmox 7.1-12 to 7.2-3.

Mounting NFS v4.0 by hand worked.
Activating the test package repository and upgrading the pve-kernel package also worked.
 
We got enough reports of QNAP NFS shares not working, thanks for that. What we would now appreciate is someone testing the new kernel from pvetest, which contains a proposed fix for this, with the NFS version pinning disabled, to see whether it works. Feedback on that would allow us to provide the solution to more people faster, at least if it indeed fixes it.
After updating to kernel 5.15.35-2, the problems with the QNAP share are gone. Good work ;).
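In case it helps anyone checking the same thing, a small sketch to verify the kernel and the negotiated NFS version (nfsstat comes from nfs-common):

Code:
# running kernel ABI and installed 5.15.35 package version
uname -r
pveversion -v | grep 5.15.35
# negotiated NFS version of the mounts (look for vers= in the flags)
nfsstat -m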
 
After upgrading from 6.4 to 7.2 on a Dell T320, the OS will not boot while VT is active in the BIOS. I also tested installing 7.2 directly; it only works when VT is off in the BIOS.

The error is the following, and the OS does not boot:

Error: DMAR: [DMA Read] Request device [02:00.0] fault addr bd42c000 [fault reason 06] PTE Read access is not set

Installing 7.1 works correctly, but upgrading to 7.2 shows the same error if virtualization is not disabled in the BIOS.
 
I have now run the upgrade from pve-no-subscription containing the pve-kernel-5.15.35-1-pve: 5.15.35-2 package, and I no longer have any issue with my QNAP (TS-253D, Ver. 5.0.0.1986) NFS mount left at the default version.

However, during the upgrade I ran into an apt issue with libproxmox-rs-perl (0.1.0). This could be fixed with apt --fix-broken install (see the sketch after the log). All 3 of my nodes were affected.

Here is the upgrade log:

Code:
Starting system upgrade: apt-get dist-upgrade
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
Calculating upgrade... Done
The following packages were automatically installed and are no longer required:
  pve-kernel-5.13.19-1-pve pve-kernel-5.13.19-3-pve pve-kernel-5.15.7-1-pve
Use 'apt autoremove' to remove them.
The following NEW packages will be installed:
  libdrm-common libdrm2 libepoxy0 libgbm1 libproxmox-rs-perl libvirglrenderer1
  libwayland-server0 pve-kernel-5.15.35-1-pve
The following packages will be upgraded:
  base-files bind9-dnsutils bind9-host bind9-libs dirmngr dnsutils gnupg gnupg-l10n
  gnupg-utils gpg gpg-agent gpg-wks-client gpg-wks-server gpgconf gpgsm gpgv gzip
  libarchive13 libc-bin libc-dev-bin libc-devtools libc-l10n libc6 libc6-dev
  libexpat1 libflac8 liblzma5 libnss-systemd libnvpair3linux libpam-systemd
  libproxmox-acme-perl libproxmox-acme-plugins libpve-access-control
  libpve-cluster-api-perl libpve-cluster-perl libpve-common-perl
  libpve-guest-common-perl libpve-rs-perl libpve-storage-perl libpve-u2f-server-perl
  libssl1.1 libsystemd0 libtiff5 libudev1 libuutil3linux libxml2 libzfs4linux
  libzpool5linux linux-libc-dev locales lxc-pve lxcfs novnc-pve openssl
  proxmox-backup-client proxmox-backup-file-restore proxmox-ve
  proxmox-widget-toolkit pve-cluster pve-container pve-docs pve-firmware
  pve-ha-manager pve-i18n pve-kernel-5.13.19-6-pve pve-kernel-5.15 pve-kernel-helper
  pve-manager pve-qemu-kvm qemu-server smartmontools spl systemd systemd-sysv
  sysvinit-utils tasksel tasksel-data tzdata udev usb.ids xz-utils zfs-initramfs
  zfs-zed zfsutils-linux zlib1g
85 upgraded, 8 newly installed, 0 to remove and 0 not upgraded.
Need to get 295 MB of archives.
After this operation, 408 MB of additional disk space will be used.
Do you want to continue? [Y/n]
...
Fetched 295 MB in 39s (7641 kB/s)                                                   
Reading changelogs... Done
Extracting templates from packages: 100%
Preconfiguring packages ...
(Reading database ... 117320 files and directories currently installed.)
...
Selecting previously unselected package libproxmox-rs-perl.
Preparing to unpack .../46-libproxmox-rs-perl_0.1.0_amd64.deb ...
Unpacking libproxmox-rs-perl (0.1.0) ...
dpkg: error processing archive /tmp/apt-dpkg-install-Hfc03K/46-libproxmox-rs-perl_0.1.0_amd64.deb (--unpack):
 trying to overwrite '/usr/share/perl5/PVE/RS/CalendarEvent.pm', which is also in package libpve-rs-perl 0.5.1
Preparing to unpack .../47-novnc-pve_1.3.0-3_all.deb ...
Unpacking novnc-pve (1.3.0-3) over (1.3.0-2) ...
Preparing to unpack .../48-proxmox-widget-toolkit_3.4-10_all.deb ...
Unpacking proxmox-widget-toolkit (3.4-10) over (3.4-7) ...
Preparing to unpack .../49-pve-docs_7.2-2_all.deb ...
Unpacking pve-docs (7.2-2) over (7.1-2) ...
Preparing to unpack .../50-pve-i18n_2.7-1_all.deb ...
Unpacking pve-i18n (2.7-1) over (2.6-2) ...
Preparing to unpack .../51-pve-manager_7.2-3_amd64.deb ...
Unpacking pve-manager (7.2-3) over (7.1-10) ...
Preparing to unpack .../52-libpve-rs-perl_0.6.1_amd64.deb ...
Unpacking libpve-rs-perl (0.6.1) over (0.5.1) ...
Preparing to unpack .../53-libpve-access-control_7.1-8_all.deb ...
Unpacking libpve-access-control (7.1-8) over (7.1-6) ...
...
Preparing to unpack .../70-zfs-zed_2.1.4-pve1_amd64.deb ...
Unpacking zfs-zed (2.1.4-pve1) over (2.1.2-pve1) ...
Errors were encountered while processing:
 /tmp/apt-dpkg-install-Hfc03K/46-libproxmox-rs-perl_0.1.0_amd64.deb
E: Sub-process /usr/bin/dpkg returned an error code (1)

Your System is up-to-date


Seems you installed a kernel update - Please consider rebooting
this node to activate the new kernel.
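The dpkg overwrite error near the end of the log above was cleared as mentioned; roughly (a sketch of the recovery, not an official procedure):

Code:
# let apt resolve the libproxmox-rs-perl / libpve-rs-perl file conflict
apt --fix-broken install
# then finish any remaining upgrades
apt dist-upgrade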
 
Dear Sirs,
I experienced the following problem on my Dell PowerEdge R730 server (with hardware RAID controller PERC H730) after updating from 7.1 to the latest 7.2. During boot (using the latest kernel 5.15.35-1-pve) I saw many errors on the boot screen regarding the RAID SAS kernel module (which is megaraid_sas). Despite the errors, the server seems to work fine. If I boot with the previous kernel (5.13.19-6-pve), no errors occur and the server is still functional.
What do you think? Is it recommended to stick to the "older" kernel (5.13.19-6-pve)? Can I force PVE to stick with the previous kernel, and if yes, how?
For your reference I attached the dmesg output as well as the megaraid_sas module info for both tested kernels, 5.15.35-1-pve and 5.13.19-6-pve.
Thanks and Best Regards,
Mike Kranidis
 

Attachments

  • megaraid_sas_5.15.35-1-pve.txt (4.1 KB)
  • dmesg_5.15.35.-1-pve.txt (146.2 KB)
  • dmesg_5.13.19-6-pve.txt (113.6 KB)
  • megaraid_sas_5.13.19-6-pve.txt (4.1 KB)
I experienced the following problem on my Dell PowerEdge R730 server (with hardware RAID controller PERC H730) after updating from 7.1 to the latest 7.2. During boot (using the latest kernel 5.15.35-1-pve) I saw many errors on the boot screen regarding the RAID SAS kernel module (which is megaraid_sas). Despite the errors, the server seems to work fine. If I boot with the previous kernel (5.13.19-6-pve), no errors occur and the server is still functional.
What do you think? Is it recommended to stick to the "older" kernel (5.13.19-6-pve)? Can I force PVE to stick with the previous kernel, and if yes, how?
I had the same issue with a Dell server; it couldn't boot up and complained about a missing megaraid module. You can pin the previous kernel with the following command, which sets the default kernel in GRUB:

proxmox-boot-tool kernel pin 5.13.19-6-pve
 
I had the same issue with a Dell server; it couldn't boot up and complained about a missing megaraid module. You can pin the previous kernel with the following command, which sets the default kernel in GRUB:

proxmox-boot-tool kernel pin 5.13.19-6-pve
Hey, not OP, but how can I unpin the kernel later? Also, do you maybe have a link to the related docs?
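Not an authoritative answer, but proxmox-boot-tool should have a matching unpin sub-command; a sketch (verify against the tool's usage output and the host bootloader section of the PVE admin guide):

Code:
# list the kernels known to proxmox-boot-tool (including any pinned one)
proxmox-boot-tool kernel list
# remove the pin again, so the latest kernel becomes the default on next boot
proxmox-boot-tool kernel unpin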
 
Hi to all,
today I did a 3-node cluster upgrade from 7.1.x to 7.2.3 (with community subscription) and the process broke on every node. After apt update / dist-upgrade (which obviously finished with errors) I rebooted the host, but then the Open vSwitch network went down and I had to switch back to a Linux bridge (thanks to IPMI :) in order to get the host back on the network (and the internet, because of potential dependencies). Right after that I repeated dist-upgrade and install -f, and the PVE host got back online and showed up in the cluster again. Then I copied back the Open vSwitch interface configuration, rebooted once more, everything went OK, and the cluster came up healthy again.
Find attached the syslog for further investigation into what went wrong with Open vSwitch...
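For reference, switching back from OVS to a plain Linux bridge as described might look roughly like this (interface name and addresses are placeholders; adapt before use):

Code:
# minimal /etc/network/interfaces fallback using a plain Linux bridge
cat > /etc/network/interfaces <<'EOF'
auto lo
iface lo inet loopback

auto eno1
iface eno1 inet manual

auto vmbr0
iface vmbr0 inet static
        address 192.0.2.10/24
        gateway 192.0.2.1
        bridge-ports eno1
        bridge-stp off
        bridge-fd 0
EOF
ifreload -a   # with ifupdown2; otherwise: systemctl restart networking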

BR to all
Tonci
 

Attachments

  • syslog.zip (333.9 KB)
