Issue after upgrade to 7.2.3

The issue is tracked here:
https://bugzilla.proxmox.com/show_bug.cgi?id=4033

We have a new pve-kernel-5.15.35-1-pve package on the pvetest repository, which contains a commit that should fix the issue with QNAP's NFS server implementation.

It would be great if you could try this kernel (and remove the workaround of pinning the NFS version beforehand).

Thanks!
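For anyone who wants to try this, a minimal sketch of the steps (the repository line assumes PVE 7.x on Debian Bullseye; the storage name is a placeholder):

Code:
# enable the pvetest repository
echo "deb http://download.proxmox.com/debian/pve bullseye pvetest" > /etc/apt/sources.list.d/pvetest.list
apt update
apt install pve-kernel-5.15.35-1-pve
# undo the NFS-version workaround: remove the "options vers=..." line of the
# affected NFS storage in /etc/pve/storage.cfg
# (pvesm set <storage-id> --delete options should do the same)
reboot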
 
For the record, the package version of the pve-kernel-5.15.35-1-pve package needs to be 5.15.35-2 for the fix.
As can be read here: https://forum.proxmox.com/threads/proxmox-ve-7-2-released.108970/post-468736
The kernel pve-kernel-5.15.30-2-pve does not fix the issue. As a remark: I have a QNAP with an iSCSI and an NFS mount, plus other storages as well, but apparently if the QNAP mount fails, the GUI freezes while everything under the hood keeps working and the cluster can still be managed via the CLI. For the record, it is mounted with NFS version 4.2.
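A quick way to verify the installed package version and the running kernel (a small sketch; output is only indicative):

Code:
# the package version must be 5.15.35-2 (or newer) for the fix
dpkg-query -W pve-kernel-5.15.35-1-pve
# running kernel ABI; should report 5.15.35-1-pve after rebooting into it
uname -r
pveversion -v | grep -i kernel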
 
That's the wrong one though: .30-2 vs. .35-1. Only the latter includes the proposed fix, and it is currently only available in pvetest.
Sorry, I missed that info ;-)

New information (each version tested by mounting manually; see the sketch below):
Only problems with NFS mounts to the QNAP.
NFSv4.2 does not work
NFSv4.1 does not work
NFSv4.0 works
NFSv3 does not work
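The per-version results above can be reproduced by mounting by hand; a sketch (server address and export path are placeholders):

Code:
mkdir -p /mnt/nfstest
# on the affected kernel only the vers=4.0 mount succeeds
mount -t nfs -o vers=4.2 192.0.2.50:/export /mnt/nfstest
mount -t nfs -o vers=4.1 192.0.2.50:/export /mnt/nfstest
mount -t nfs -o vers=4.0 192.0.2.50:/export /mnt/nfstest
mount -t nfs -o vers=3   192.0.2.50:/export /mnt/nfstest
umount /mnt/nfstest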
 
A question: is it really optimal for mounts to be blocking?
Would it not be better if mounts were non-blocking and used a timeout instead, and, in case a mount hits that timeout, refuse to start VMs/CTs with disks on those mounts?
 
Confirmed: I have the same problem after the upgrade, and I'm also using a QNAP NFS share.
 
Would it not be better if mounts were non-blocking and used a timeout instead, and, in case a mount hits that timeout, refuse to start VMs/CTs with disks on those mounts?
There is a timeout, but as mentioned elsewhere, those processes end up in uninterruptible sleep, and no timeout will help you there; they simply cannot be interrupted. There are ideas on how to improve on that, but it isn't trivial and would change the existing architecture quite a bit.
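As an illustration of why no timeout can help, a sketch of how to spot such hung processes (purely diagnostic):

Code:
# tasks in uninterruptible sleep show state 'D' and cannot be killed or timed out
ps -eo pid,stat,wchan:32,cmd | awk 'NR==1 || $2 ~ /^D/'
# optionally dump all blocked tasks (with kernel stack traces) to the kernel log
echo w > /proc/sysrq-trigger
dmesg | tail -n 50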
 
We got enough reports of QNAP NFS shares not working, thanks for that. What we would now appreciate is someone testing the new kernel from pvetest, which contains a proposed fix for this, with the NFS version pinning disabled, to see whether it works. Feedback on that would allow us to provide the solution to more people faster, at least if it indeed fixes it.
 
Confirmed: I had the same problems with a QNAP TS-1677XU-RP after an update from Proxmox 7.1-12 to 7.2-3.

Mounting NFS v4.0 by hand worked.
Activating the test package repository and upgrading the pve-kernel package also worked.
 
We got enough reports of QNAP NFS shares not working, thanks for that. What we would now appreciate is someone testing the new kernel from pvetest, which contains a proposed fix for this, with the NFS version pinning disabled, to see whether it works. Feedback on that would allow us to provide the solution to more people faster, at least if it indeed fixes it.
After updating to kernel 5.15.35-2, the problems with the QNAP share are gone. Good work ;).
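In case it helps anyone checking the same thing, a small sketch to verify the kernel and the negotiated NFS version (nfsstat comes from nfs-common):

Code:
# running kernel ABI and installed 5.15.35 package version
uname -r
pveversion -v | grep 5.15.35
# negotiated NFS version of the mounts (look for vers= in the flags)
nfsstat -m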
 
After upgrading from 6.4 to 7.2 on a Dell T320, the OS will not boot while VT is active in the BIOS. I also tested installing 7.2 directly; it only works when VT is off in the BIOS.

The error is the following, and the OS does not boot:

Error: DMAR: [DMA Read] Request device [02:00.0] fault addr bd42c000 [fault reason 06] PTE Read access is not set

Installing 7.1 works correctly, but upgrading to 7.2 shows the same error if virtualization is not disabled in the BIOS.
 
I have now run the upgrade from pve-no-subscription containing the pve-kernel-5.15.35-1-pve: 5.15.35-2 package, and I no longer have any issue with my QNAP (TS-253D, Ver. 5.0.0.1986) NFS mount left at the default version.

However, during the upgrade I ran into an apt issue with libproxmox-rs-perl (0.1.0). This could be fixed with apt --fix-broken install (see the sketch after the log). All 3 of my nodes were affected.

Here is the upgrade log:

Code:
Starting system upgrade: apt-get dist-upgrade
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
Calculating upgrade... Done
The following packages were automatically installed and are no longer required:
  pve-kernel-5.13.19-1-pve pve-kernel-5.13.19-3-pve pve-kernel-5.15.7-1-pve
Use 'apt autoremove' to remove them.
The following NEW packages will be installed:
  libdrm-common libdrm2 libepoxy0 libgbm1 libproxmox-rs-perl libvirglrenderer1
  libwayland-server0 pve-kernel-5.15.35-1-pve
The following packages will be upgraded:
  base-files bind9-dnsutils bind9-host bind9-libs dirmngr dnsutils gnupg gnupg-l10n
  gnupg-utils gpg gpg-agent gpg-wks-client gpg-wks-server gpgconf gpgsm gpgv gzip
  libarchive13 libc-bin libc-dev-bin libc-devtools libc-l10n libc6 libc6-dev
  libexpat1 libflac8 liblzma5 libnss-systemd libnvpair3linux libpam-systemd
  libproxmox-acme-perl libproxmox-acme-plugins libpve-access-control
  libpve-cluster-api-perl libpve-cluster-perl libpve-common-perl
  libpve-guest-common-perl libpve-rs-perl libpve-storage-perl libpve-u2f-server-perl
  libssl1.1 libsystemd0 libtiff5 libudev1 libuutil3linux libxml2 libzfs4linux
  libzpool5linux linux-libc-dev locales lxc-pve lxcfs novnc-pve openssl
  proxmox-backup-client proxmox-backup-file-restore proxmox-ve
  proxmox-widget-toolkit pve-cluster pve-container pve-docs pve-firmware
  pve-ha-manager pve-i18n pve-kernel-5.13.19-6-pve pve-kernel-5.15 pve-kernel-helper
  pve-manager pve-qemu-kvm qemu-server smartmontools spl systemd systemd-sysv
  sysvinit-utils tasksel tasksel-data tzdata udev usb.ids xz-utils zfs-initramfs
  zfs-zed zfsutils-linux zlib1g
85 upgraded, 8 newly installed, 0 to remove and 0 not upgraded.
Need to get 295 MB of archives.
After this operation, 408 MB of additional disk space will be used.
Do you want to continue? [Y/n]
...
Fetched 295 MB in 39s (7641 kB/s)                                                   
Reading changelogs... Done
Extracting templates from packages: 100%
Preconfiguring packages ...
(Reading database ... 117320 files and directories currently installed.)
...
Selecting previously unselected package libproxmox-rs-perl.
Preparing to unpack .../46-libproxmox-rs-perl_0.1.0_amd64.deb ...
Unpacking libproxmox-rs-perl (0.1.0) ...
dpkg: error processing archive /tmp/apt-dpkg-install-Hfc03K/46-libproxmox-rs-perl_0.1.0_amd64.deb (--unpack):
 trying to overwrite '/usr/share/perl5/PVE/RS/CalendarEvent.pm', which is also in package libpve-rs-perl 0.5.1
Preparing to unpack .../47-novnc-pve_1.3.0-3_all.deb ...
Unpacking novnc-pve (1.3.0-3) over (1.3.0-2) ...
Preparing to unpack .../48-proxmox-widget-toolkit_3.4-10_all.deb ...
Unpacking proxmox-widget-toolkit (3.4-10) over (3.4-7) ...
Preparing to unpack .../49-pve-docs_7.2-2_all.deb ...
Unpacking pve-docs (7.2-2) over (7.1-2) ...
Preparing to unpack .../50-pve-i18n_2.7-1_all.deb ...
Unpacking pve-i18n (2.7-1) over (2.6-2) ...
Preparing to unpack .../51-pve-manager_7.2-3_amd64.deb ...
Unpacking pve-manager (7.2-3) over (7.1-10) ...
Preparing to unpack .../52-libpve-rs-perl_0.6.1_amd64.deb ...
Unpacking libpve-rs-perl (0.6.1) over (0.5.1) ...
Preparing to unpack .../53-libpve-access-control_7.1-8_all.deb ...
Unpacking libpve-access-control (7.1-8) over (7.1-6) ...
...
Preparing to unpack .../70-zfs-zed_2.1.4-pve1_amd64.deb ...
Unpacking zfs-zed (2.1.4-pve1) over (2.1.2-pve1) ...
Errors were encountered while processing:
 /tmp/apt-dpkg-install-Hfc03K/46-libproxmox-rs-perl_0.1.0_amd64.deb
E: Sub-process /usr/bin/dpkg returned an error code (1)

Your System is up-to-date


Seems you installed a kernel update - Please consider rebooting
this node to activate the new kernel.
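The dpkg overwrite error near the end of the log above was cleared as mentioned; roughly (a sketch of the recovery, not an official procedure):

Code:
# let apt resolve the libproxmox-rs-perl / libpve-rs-perl file conflict
apt --fix-broken install
# then finish any remaining upgrades
apt dist-upgrade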
 
Dear Sirs,
I experienced the following problem on my Dell PowerEdge R730 server (with hardware RAID controller PERC H730) after updating from 7.1 to the latest 7.2. During boot (using the latest kernel 5.15.35-1-pve) I saw many errors on the boot screen regarding the RAID SAS kernel module (which is megaraid_sas). Despite the errors, the server seems to work fine. If I boot with the previous kernel (5.13.19-6-pve), no errors occur and the server is still functional.
What do you think? Is it recommended to stick to the "older" kernel (5.13.19-6-pve)? Can I force PVE to stick with the previous kernel, and if yes, how?
For your reference I attached the dmesg output as well as the megaraid_sas module info for both tested kernels, 5.15.35-1-pve and 5.13.19-6-pve.
Thanks and Best Regards,
Mike Kranidis
 

Attachments

  • megaraid_sas_5.15.35-1-pve.txt (4.1 KB)
  • dmesg_5.15.35.-1-pve.txt (146.2 KB)
  • dmesg_5.13.19-6-pve.txt (113.6 KB)
  • megaraid_sas_5.13.19-6-pve.txt (4.1 KB)
I experienced the following problem on my Dell PowerEdge R730 server (with hardware RAID controller PERC H730) after updating from 7.1 to the latest 7.2. During boot (using the latest kernel 5.15.35-1-pve) I saw many errors on the boot screen regarding the RAID SAS kernel module (which is megaraid_sas). Despite the errors, the server seems to work fine. If I boot with the previous kernel (5.13.19-6-pve), no errors occur and the server is still functional.
What do you think? Is it recommended to stick to the "older" kernel (5.13.19-6-pve)? Can I force PVE to stick with the previous kernel, and if yes, how?
I had the same issue with a Dell server; it couldn't boot up and complained about a missing megaraid module. You can pin the previous kernel with the following command, which sets the default kernel in GRUB:

proxmox-boot-tool kernel pin 5.13.19-6-pve
 
I had the same issue with a Dell server; it couldn't boot up and complained about a missing megaraid module. You can pin the previous kernel with the following command, which sets the default kernel in GRUB:

proxmox-boot-tool kernel pin 5.13.19-6-pve
Hey, not OP, but how can I unpin the kernel later? Also, do you maybe have a link to the related docs?
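Not an authoritative answer, but proxmox-boot-tool should have a matching unpin sub-command; a sketch (verify against the tool's usage output and the host bootloader section of the PVE admin guide):

Code:
# list the kernels known to proxmox-boot-tool (including any pinned one)
proxmox-boot-tool kernel list
# remove the pin again, so the latest kernel becomes the default on next boot
proxmox-boot-tool kernel unpin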
 
Hi to all,
today I did a 3-node cluster upgrade from 7.1.x to 7.2.3 (with community subscription) and the process broke on every node. After apt update / dist-upgrade (which obviously finished with errors) I rebooted the host, but then the Open vSwitch network went down and I had to switch back to a Linux bridge (thanks to IPMI :) in order to get the host back on the network (and the internet, because of potential dependencies). Right after that I repeated dist-upgrade and install -f, and the PVE host got back online and showed up in the cluster again. Then I copied back the Open vSwitch interface configuration, rebooted once more, everything went OK, and the cluster came up healthy again.
Find attached the syslog for further investigation into what went wrong with Open vSwitch...
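For reference, switching back from OVS to a plain Linux bridge as described might look roughly like this (interface name and addresses are placeholders; adapt before use):

Code:
# minimal /etc/network/interfaces fallback using a plain Linux bridge
cat > /etc/network/interfaces <<'EOF'
auto lo
iface lo inet loopback

auto eno1
iface eno1 inet manual

auto vmbr0
iface vmbr0 inet static
        address 192.0.2.10/24
        gateway 192.0.2.1
        bridge-ports eno1
        bridge-stp off
        bridge-fd 0
EOF
ifreload -a   # with ifupdown2; otherwise: systemctl restart networking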

BR to all
Tonci
 

Attachments

  • syslog.zip (333.9 KB)
