Proxmox VE 7.2 released!

Having some issues with the screen not working during installation on Dell R610 server. I've tried this with 4 different monitors at our data center, but was successful with Dell R630. It appears that the screen dimensions during installation (after the initial Proxmox installation choice screen, and then the windowed CLI screen that shows after it) blanks during installation, with the monitor stating that the screen dimensions are not compatible with the monitor. The monitors are typical 19" and 24" 4:3 and 16:9 ratio. I realize this may be something to do with the older servers, but what can be done to allow this installation to be completed?

P.S. Was able to install PVE 6 on that box just fine. Definitely something different in PVE 7.
I am getting that error on my R740XD as well. It seems like Proxmox is using a resolution that is too high or otherwise unusual, perhaps?
It worked fine with Proxmox 7.1.
 

Stoiko Ivanov

Proxmox Staff Member
Having some issues with the screen not working during installation on Dell R610 server. I've tried this with 4 different monitors at our data center, but was successful with Dell R630. It appears that the screen dimensions during installation (after the initial Proxmox installation choice screen, and then the windowed CLI screen that shows after it) blanks during installation, with the monitor stating that the screen dimensions are not compatible with the monitor. The monitors are typical 19" and 24" 4:3 and 16:9 ratio. I realize this may be something to do with the older servers, but what can be done to allow this installation to be completed?

P.S. Was able to install PVE 6 on that box just fine. Definitely something different in PVE 7.

The simplest and cleanest solutions in those cases are:
* install on top of plain Debian Bullseye - see https://pve.proxmox.com/wiki/Install_Proxmox_VE_on_Debian_11_Bullseye (unless you want ZFS on root)
* install 6.4 and dist-upgrade to 7.2 - see https://pve.proxmox.com/wiki/Upgrade_from_6.x_to_7.0 (if you want ZFS on root)

Otherwise, open a new thread and share some screenshots of where the issue shows up (start the installer in debug mode and share how far you get).
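For the first route, the wiki page linked above boils down to a few commands; a condensed sketch for PVE 7.x on Bullseye (verify against the wiki before running, repository and package names as documented there):

```shell
# Sketch of installing Proxmox VE 7.x on top of a plain Debian 11 install,
# condensed from the wiki page linked above -- check the wiki for details.

# 1. Add the Proxmox VE no-subscription repository and its signing key
echo "deb [arch=amd64] http://download.proxmox.com/debian/pve bullseye pve-no-subscription" \
    > /etc/apt/sources.list.d/pve-install-repo.list
wget https://enterprise.proxmox.com/debian/proxmox-release-bullseye.gpg \
    -O /etc/apt/trusted.gpg.d/proxmox-release-bullseye.gpg

# 2. Update and pull in the Proxmox VE kernel first, then reboot into it
apt update && apt full-upgrade
apt install pve-kernel-5.15
systemctl reboot

# 3. After rebooting into the PVE kernel, install the actual PVE packages
apt install proxmox-ve postfix open-iscsi
```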
 
The simplest and cleanest solutions in those cases are:
* install on top of plain Debian Bullseye - see https://pve.proxmox.com/wiki/Install_Proxmox_VE_on_Debian_11_Bullseye (unless you want ZFS on root)
* install 6.4 and dist-upgrade to 7.2 - see https://pve.proxmox.com/wiki/Upgrade_from_6.x_to_7.0 (if you want ZFS on root)

Otherwise, open a new thread and share some screenshots of where the issue shows up (start the installer in debug mode and share how far you get).

Will start a new thread, thanks.
And I'll also buy a subscription; I like this product!
 
What hardware are you running on?

  • Node1
    • Dell R720xd
    • 2x Intel Xeon E5-2650 v2
    • 128GB RAM
    • OS on 2x SHGS31-500GS-2 via ZFS RAID1
    • 10x Seagate Nytro 1551 1.92TB for Ceph
  • Node2
    • Supermicro H8DGU based
    • 2x AMD Opteron 6348
    • 64GB RAM
    • OS on 2x WD1004FBYZ via ZFS RAID1
    • 6x Seagate Nytro 1551 1.92TB for Ceph
  • Node3
    • Dell R750
    • 2x Intel Xeon Gold 6346
    • 128GB RAM
    • OS on 2x SHGS31-500GS-2 via ZFS RAID1
    • 6x Seagate Nytro 1551 1.92TB for Ceph
  • Node4
    • Supermicro X9DRW based
    • 1x Intel Xeon E5-2609
    • 64GB RAM
    • OS on 2x WD1500BLHX via ZFS RAID1
    • 3x Seagate Nytro 1551 1.92TB for Ceph
    • 4x Seagate Nytro 1551 1.92TB hot spares for Ceph (poor thing can't run more without getting overloaded)
Optimal? No. Stable? Very. The X9DRW is next up for replacement, will likely live on as an offsite PBS sync target.

EDIT: This doesn't address networking. Currently VMs and Cluster use the same network (active-backup 2x1Gb bonded interfaces), but moving VMs to a separate network is WIP. Ceph is running on 10Gb switches and active-backup 2x10Gb bonded DACs. The 10Gb switches also act as primary backbone for other switches in the rack. Again, not completely optimal but very stable when nobody screws up their switch priority while adding topology...
 

jtainio

Great
I also got tricked into upgrading, and the NVIDIA GPU stopped working with the new 5.15 kernel. 5.13 works.
I had similar problems: Asus X99-E motherboard, NVIDIA RTX 2080 Ti and 3070 graphics cards. Everything worked fine until I restarted the host after the upgrade, and the dmesg log was filled with:
[ 1953.114004] vfio-pci 0000:05:00.0: BAR 3: can't reserve [mem 0xb0000000-0xb1ffffff 64bit pref]

Tried to shuffle the GPUs around on the motherboard after reading that PCIe slot 1 can be problematic, but it didn't help.
After many hours of torment, booting the host with the 5.13.19-6-pve kernel helped. Booting with either the 5.15.30-2-pve or 5.15.35-1-pve kernel results in the error mentioned above.
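If staying on 5.13 for now, the known-good kernel can be pinned so it is booted across reboots instead of being picked in the boot menu each time; a sketch using proxmox-boot-tool (available on PVE 7.x; the version string below is the one from this post, substitute your own):

```shell
# List installed kernels, then pin the known-good one as the default boot entry.
proxmox-boot-tool kernel list
proxmox-boot-tool kernel pin 5.13.19-6-pve   # use the version from your own list
proxmox-boot-tool refresh

# Later, once a fixed 5.15 kernel is released, drop the pin again:
proxmox-boot-tool kernel unpin
```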
 
Great

I had similar problems: Asus X99-E motherboard, NVIDIA RTX 2080 Ti and 3070 graphics cards. Everything worked fine until I restarted the host after the upgrade, and the dmesg log was filled with:
[ 1953.114004] vfio-pci 0000:05:00.0: BAR 3: can't reserve [mem 0xb0000000-0xb1ffffff 64bit pref]

Tried to shuffle the GPUs around on the motherboard after reading that PCIe slot 1 can be problematic, but it didn't help.
After many hours of torment, booting the host with the 5.13.19-6-pve kernel helped. Booting with either the 5.15.30-2-pve or 5.15.35-1-pve kernel results in the error mentioned above.

For you and anyone else running into this, I was able to get my home server working with Nvidia passthrough again on the new kernel using the script found in this thread and the other changes I mention:

https://forum.proxmox.com/threads/gpu-passthrough-issues-after-upgrade-to-7-2.109051/post-471546
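For anyone who just wants the gist: the approach discussed in that thread revolves around keeping the host's boot framebuffer off the GPU that is being passed through. A sketch of the kernel-command-line variant for GRUB-booted hosts (the exact parameter set is an assumption based on that discussion; verify it matches your setup before rebooting):

```shell
# Edit /etc/default/grub and append to the existing GRUB_CMDLINE_LINUX_DEFAULT, e.g.:
#   GRUB_CMDLINE_LINUX_DEFAULT="quiet intel_iommu=on initcall_blacklist=sysfb_init"
# On 5.15, 'initcall_blacklist=sysfb_init' takes over the role the older
# 'video=efifb:off' trick played on earlier kernels.
update-grub          # for GRUB-booted hosts
# proxmox-boot-tool refresh   # instead, for systemd-boot / ZFS-on-root installs
```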
 

Dunuin

Maybe we could do this frontend side only, i.e., make the notes' field from the manual backup one stateful, saving the last used one in the browser local storage for reuse, would allow avoiding yet another config option.

The PVE upgrade asked me to replace my vzdump.conf with a new version. The new config file got a new line, #notes-template: {{node}}. I uncommented it and replaced it with some custom text, and manual backups now indeed get my custom text as the default.
So thanks for the fast addition :)
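For reference, the line in question lives in /etc/vzdump.conf; a sketch (the {{guestname}} and {{vmid}} variables are taken from the vzdump documentation, adjust to taste):

```
# /etc/vzdump.conf -- default notes attached to new backups.
# Documented template variables include {{cluster}}, {{guestname}}, {{node}}, {{vmid}}.
notes-template: {{guestname}} ({{vmid}}) on {{node}}
```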
 

sherminator

Anyone NOT having an issue with 7.2 release with enterprise subscription kernel?
Hey,

we had no issues upgrading our three-node PVE/Ceph cluster to 7.2 (except for some minor dependency foo during the upgrade process). Our systems are based on Supermicro mainboards with AMD EPYC CPUs.

Greets
Stephan
 

gseeley

For you and anyone else running into this, I was able to get my home server working with Nvidia passthrough again on the new kernel using the script found in this thread and the other changes I mention:

https://forum.proxmox.com/threads/gpu-passthrough-issues-after-upgrade-to-7-2.109051/post-471546
I tried this out yesterday and, while it did allow a Windows 10 VM to boot with an NVIDIA Quadro card on the latest 5.15.35-3 kernel, the VM crashed twice:
Code:
May 19 12:03:07 pve-x10sdv QEMU[24492]: KVM: entry failed, hardware error 0x80000021
May 19 12:03:07 pve-x10sdv QEMU[24492]: If you're running a guest on an Intel machine without unrestricted mode
May 19 12:03:07 pve-x10sdv QEMU[24492]: support, the failure can be most likely due to the guest entering an invalid
May 19 12:03:07 pve-x10sdv QEMU[24492]: state for Intel VT. For example, the guest maybe running in big real mode
May 19 12:03:07 pve-x10sdv QEMU[24492]: which is not supported on less recent Intel processors.
May 19 12:03:07 pve-x10sdv kernel: set kvm_intel.dump_invalid_vmcs=1 to dump internal KVM state.
May 19 12:03:07 pve-x10sdv QEMU[24492]: EAX=00000000 EBX=00000000 ECX=00000000 EDX=dbab0948
May 19 12:03:07 pve-x10sdv QEMU[24492]: ESI=d04b0320 EDI=d4742b00 EBP=65b71a40 ESP=6db7efb0
May 19 12:03:07 pve-x10sdv QEMU[24492]: EIP=00008000 EFL=00000002 [-------] CPL=0 II=0 A20=1 SMM=1 HLT=0
May 19 12:03:07 pve-x10sdv QEMU[24492]: ES =0000 00000000 ffffffff 00809300
May 19 12:03:07 pve-x10sdv QEMU[24492]: CS =be00 7ffbe000 ffffffff 00809300
May 19 12:03:07 pve-x10sdv QEMU[24492]: SS =0000 00000000 ffffffff 00809300
May 19 12:03:07 pve-x10sdv QEMU[24492]: DS =0000 00000000 ffffffff 00809300
May 19 12:03:07 pve-x10sdv QEMU[24492]: FS =0000 00000000 ffffffff 00809300
May 19 12:03:07 pve-x10sdv QEMU[24492]: GS =0000 00000000 ffffffff 00809300
May 19 12:03:07 pve-x10sdv QEMU[24492]: LDT=0000 00000000 00000000 00000000
May 19 12:03:07 pve-x10sdv QEMU[24492]: TR =0040 6db73000 00000067 00008b00
May 19 12:03:07 pve-x10sdv QEMU[24492]: GDT= 6db74fb0 00000057
May 19 12:03:07 pve-x10sdv QEMU[24492]: IDT= 00000000 00000000
May 19 12:03:07 pve-x10sdv QEMU[24492]: CR0=00050032 CR2=bc3c2a08 CR3=55508002 CR4=00000000
May 19 12:03:07 pve-x10sdv QEMU[24492]: DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000 DR3=0000000000000000
May 19 12:03:07 pve-x10sdv QEMU[24492]: DR6=00000000fffe0ff0 DR7=0000000000000400
May 19 12:03:07 pve-x10sdv QEMU[24492]: EFER=0000000000000000
May 19 12:03:07 pve-x10sdv QEMU[24492]: Code=kvm: ../hw/core/cpu-sysemu.c:77: cpu_asidx_from_attrs: Assertion `ret < cpu->num_ases && ret >= 0' failed.
May 19 12:03:08 pve-x10sdv kernel: fwbr100i0: port 2(tap100i0) entered disabled state
May 19 12:03:08 pve-x10sdv kernel: fwbr100i0: port 2(tap100i0) entered disabled state
May 19 12:03:08 pve-x10sdv kernel: DMAR: DRHD: handling fault status reg 2
May 19 12:03:08 pve-x10sdv kernel: DMAR: [DMA Read NO_PASID] Request device [00:14.0] fault addr 0x8605fb000 [fault reason 0x02] Present bit in context entry is clear
May 19 12:03:10 pve-x10sdv systemd[1]: 100.scope: Succeeded.
May 19 12:03:10 pve-x10sdv systemd[1]: 100.scope: Consumed 9h 40min 16.985s CPU time.
May 19 12:03:10 pve-x10sdv qmeventd[90611]: Starting cleanup for 100
May 19 12:03:10 pve-x10sdv kernel: fwbr100i0: port 1(fwln100i0) entered disabled state
May 19 12:03:10 pve-x10sdv kernel: vmbr0: port 2(fwpr100p0) entered disabled state
May 19 12:03:10 pve-x10sdv kernel: device fwln100i0 left promiscuous mode
May 19 12:03:10 pve-x10sdv kernel: fwbr100i0: port 1(fwln100i0) entered disabled state
May 19 12:03:10 pve-x10sdv kernel: device fwpr100p0 left promiscuous mode
May 19 12:03:10 pve-x10sdv kernel: vmbr0: port 2(fwpr100p0) entered disabled state
May 19 12:03:13 pve-x10sdv kernel: device bond0 left promiscuous mode
May 19 12:03:13 pve-x10sdv kernel: device eno3 left promiscuous mode
May 19 12:03:13 pve-x10sdv kernel: device eno4 left promiscuous mode
May 19 12:03:14 pve-x10sdv qmeventd[90611]: Finished cleanup for 100
...
May 19 12:45:55 pve-x10sdv QEMU[92696]: KVM: entry failed, hardware error 0x80000021
May 19 12:45:55 pve-x10sdv QEMU[92696]: If you're running a guest on an Intel machine without unrestricted mode
May 19 12:45:55 pve-x10sdv QEMU[92696]: support, the failure can be most likely due to the guest entering an invalid
May 19 12:45:55 pve-x10sdv QEMU[92696]: state for Intel VT. For example, the guest maybe running in big real mode
May 19 12:45:55 pve-x10sdv QEMU[92696]: which is not supported on less recent Intel processors.
May 19 12:45:55 pve-x10sdv kernel: set kvm_intel.dump_invalid_vmcs=1 to dump internal KVM state.
May 19 12:45:55 pve-x10sdv QEMU[92696]: EAX=00000026 EBX=00000001 ECX=7a310948 EDX=02000000
May 19 12:45:55 pve-x10sdv QEMU[92696]: ESI=00000000 EDI=7a3109b0 EBP=7a310790 ESP=7a310710
May 19 12:45:55 pve-x10sdv QEMU[92696]: EIP=00008000 EFL=00000002 [-------] CPL=0 II=0 A20=1 SMM=1 HLT=0
May 19 12:45:55 pve-x10sdv QEMU[92696]: ES =0000 00000000 ffffffff 00809300
May 19 12:45:55 pve-x10sdv QEMU[92696]: CS =ba00 7ffba000 ffffffff 00809300
May 19 12:45:55 pve-x10sdv QEMU[92696]: SS =0000 00000000 ffffffff 00809300
May 19 12:45:55 pve-x10sdv QEMU[92696]: DS =0000 00000000 ffffffff 00809300
May 19 12:45:55 pve-x10sdv QEMU[92696]: FS =0000 00000000 ffffffff 00809300
May 19 12:45:55 pve-x10sdv QEMU[92696]: GS =0000 00000000 ffffffff 00809300
May 19 12:45:55 pve-x10sdv QEMU[92696]: LDT=0000 00000000 000fffff 00000000
May 19 12:45:55 pve-x10sdv QEMU[92696]: TR =0040 3bc86000 00000067 00008b00
May 19 12:45:55 pve-x10sdv QEMU[92696]: GDT= 3bc87fb0 00000057
May 19 12:45:55 pve-x10sdv QEMU[92696]: IDT= 00000000 00000000
May 19 12:45:55 pve-x10sdv QEMU[92696]: CR0=00050032 CR2=c1a57780 CR3=001ad002 CR4=00000000
May 19 12:45:55 pve-x10sdv QEMU[92696]: DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000 DR3=0000000000000000
May 19 12:45:55 pve-x10sdv QEMU[92696]: DR6=00000000ffff0ff0 DR7=0000000000000400
May 19 12:45:55 pve-x10sdv QEMU[92696]: EFER=0000000000000000
May 19 12:45:55 pve-x10sdv QEMU[92696]: Code=kvm: ../hw/core/cpu-sysemu.c:77: cpu_asidx_from_attrs: Assertion `ret < cpu->num_ases && ret >= 0' failed.
May 19 12:45:56 pve-x10sdv kernel: fwbr100i0: port 2(tap100i0) entered disabled state
May 19 12:45:56 pve-x10sdv kernel: fwbr100i0: port 2(tap100i0) entered disabled state
May 19 12:45:56 pve-x10sdv qmeventd[98673]: Starting cleanup for 100
May 19 12:45:56 pve-x10sdv kernel: fwbr100i0: port 1(fwln100i0) entered disabled state
May 19 12:45:56 pve-x10sdv kernel: vmbr0: port 2(fwpr100p0) entered disabled state
May 19 12:45:56 pve-x10sdv kernel: device fwln100i0 left promiscuous mode
May 19 12:45:56 pve-x10sdv kernel: fwbr100i0: port 1(fwln100i0) entered disabled state
May 19 12:45:56 pve-x10sdv kernel: device fwpr100p0 left promiscuous mode
May 19 12:45:56 pve-x10sdv kernel: vmbr0: port 2(fwpr100p0) entered disabled state
May 19 12:45:57 pve-x10sdv systemd[1]: 100.scope: Succeeded.
May 19 12:45:57 pve-x10sdv systemd[1]: 100.scope: Consumed 51min 55.888s CPU time.
May 19 12:46:01 pve-x10sdv kernel: device bond0 left promiscuous mode
May 19 12:46:01 pve-x10sdv kernel: device eno3 left promiscuous mode
May 19 12:46:01 pve-x10sdv kernel: device eno4 left promiscuous mode
May 19 12:46:01 pve-x10sdv qmeventd[98673]: Finished cleanup for 100
At this point I pinned the older 5.13.x kernel again and the VM ran fine for the rest of the day without crashing.

On another host, the latest 5.15.35-3 kernel shows "0" in the KSM Sharing column, where the previous kernel (5.15) showed a value, so at first I wasn't sure what was going on with these kernel builds. EDIT: KSM sharing only starts at 80% RAM usage; I had tuned some VM min/max memory settings, which obviously brought usage below 80%, so this isn't an issue.
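KSM activity can also be checked directly via sysfs instead of relying on the GUI column; a quick sketch:

```shell
# pages_sharing > 0 means KSM deduplication is actually happening.
grep . /sys/kernel/mm/ksm/pages_sharing /sys/kernel/mm/ksm/pages_shared

# The activation threshold is set by ksmtuned (KSM_THRES_COEF, 20 by default,
# i.e. KSM starts once less than 20% of RAM is free).
cat /etc/ksmtuned.conf
```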
 

MrPete

Besides the fact that we refer to the b(eta) version in the release notes: Why does that matter? ... the new rewrite, while maybe marked as beta, worked in all our tests on quite some hosts from DDR3 to DDR5 and virtual, which the old version couldn't even do...
That's VERY good to hear.

I certainly agree that the new version is hugely improved. It caught memory issues for me that were simply invisible to older versions.

I was simply cautious about it being offered without a word of caution, since memory tests are relied on as a solid confidence builder for a host machine. The fact that you've done extensive testing is very encouraging.

Thanks!
 

ITT

Anyone NOT having an issue with 7.2 release with enterprise subscription kernel?
Some minor issues:

1.) Backups froze VMs with PBS on our Ceph cluster -> FIX: upgrading to the pve-qemu-kvm from the test repo
2.) The firmware of our Intel X550-T2 NICs was now too old for kernel 5.15 -> FIX: upgrading the firmware

So, all went well for now.
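For the first fix, pulling a single package from the test repository before it reaches the enterprise one can be sketched like this (the temporary file name is arbitrary; remove the repo again afterwards so regular upgrades stay on your normal repository):

```shell
# Temporarily add the pvetest repository, upgrade only pve-qemu-kvm, then drop it again.
echo "deb http://download.proxmox.com/debian/pve bullseye pvetest" \
    > /etc/apt/sources.list.d/pvetest-tmp.list
apt update
apt install pve-qemu-kvm                      # pulls the newer build from pvetest
rm /etc/apt/sources.list.d/pvetest-tmp.list   # remove the repo again
apt update
```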
 

hvisage

FAQ
Q: Can I upgrade Proxmox VE 7.0 or 7.1 to 7.2 via GUI?
A: Yes.
I need to add a correction there:
*IF* you use Open vSwitch: *NO*.
You must do it from a console, and expect to reboot, as that is the easiest way to get the system and the VMs/LXCs reconnected to the Open vSwitch bridge(s).
Sorry, I get bitten by this every time I try it remotely, and I've been watching CAREFULLY to NOT do it via the GUI whenever there is an openvswitch package update.
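In practice, that means running the upgrade detached on the console (or via out-of-band management), so the OVS bridge going down can't kill the session mid-upgrade; a sketch using screen:

```shell
# Run the upgrade in a detached screen session so a dropped network
# connection (e.g. the OVS bridge being reconfigured) doesn't interrupt it.
apt update
screen -dmS upgrade bash -c 'apt -y full-upgrade; systemctl reboot'
screen -r upgrade    # reattach from the physical console / iKVM to watch progress
```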
 
Coming from VMware with shared storage, we still miss a "Maintenance Mode".

There is nothing like that visible on the roadmap. Is something like that planned?

Example:
- Proxmox node needs hardware maintenance
- Instead of Shutdown, click "Maintenance Mode"
- VMs are migrated away to other cluster members where they fit
- CTs are shut down
- As soon as the node is emptied, do whatever is needed
- Node comes online again and is still in "Maintenance Mode"
- Click "Leave Maintenance Mode"
- Manually migrate VMs back.
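Until something like this exists, the workflow above can be approximated by hand; a rough sketch (the node name "node2" is a placeholder, and blindly targeting a single node is exactly the overload problem a real maintenance mode with balancing would solve):

```shell
#!/bin/bash
# Rough manual stand-in for a "maintenance mode" on the local node:
# migrate all running VMs to a target node, then shut down local containers.
TARGET=node2   # placeholder -- pick a node with enough free resources

# qm list prints: VMID NAME STATUS ... -> status is column 3
for vmid in $(qm list | awk '$3 == "running" {print $1}'); do
    qm migrate "$vmid" "$TARGET" --online
done

# pct list prints: VMID Status ... -> status is column 2
for ctid in $(pct list | awk '$2 == "running" {print $1}'); do
    pct shutdown "$ctid"
done
```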
 

t.lamprecht

Proxmox Staff Member

Zerstoiber

New Member
Jun 2, 2022
3
2
3
Already exists for HA managed guests:
https://pve.proxmox.com/pve-docs/chapter-ha-manager.html#ha_manager_shutdown_policy

And for non-ha managed ones it's planned to get in the long term, it may well fit into the dynamic resource scheduling/balancing, for which there's some experimenting going on currently.

But that explicitly only works on a host shutdown.

Being able to put hosts into maintenance mode would be nice for actions where you actually have to do stuff on the (running!) system that might result in problems or unforeseeable disruption (driver updates, firmware updates, working on networking equipment, and so on).
When VMs are migrated away, this work can safely be done: if we cause a problem during maintenance, it should not cause problems for VMs running somewhere else (and then it should also not matter for things like Ceph, because of its redundancy).

Code:
Reboot
Node reboots are initiated with the reboot command. This is usually done after installing a new kernel. 
Please note that this is different from “shutdown”, because the node immediately starts again.

The LRM tells the CRM that it wants to restart, and waits until the CRM puts all resources into the freeze state (the same mechanism is used for Package Updates: https://pve.proxmox.com/pve-docs/chapter-ha-manager.html#ha_manager_package_updates).
This prevents those resources from being moved to other nodes. Instead, the CRM starts the resources after the reboot on the same node.


I don't understand the reasoning: if we have HA available, why would it be preferable to disrupt services by putting VMs into the freeze state?
Please make that configurable!
 

t.lamprecht

Proxmox Staff Member
Being able to put hosts into maintenance mode would be nice for actions where you actually have to do stuff on the (running!) system that might result in problems or unforeseeable disruption (driver updates, firmware updates, working on networking equipment, and so on).
Sure, that will also be easier to do once:
And for non-ha managed ones it's planned to get in the long term, it may well fit into the dynamic resource scheduling/balancing, for which there's some experimenting going on currently.
bears fruit. A bulk migration to a single target is already possible, but naturally that's not what one wants (it overloads the target), and it's a nuisance to manually migrate everything back after the deed is done, so a balancing scheduler is required for this to make sense.
For VMs with local resources that cannot be live-migrated (or migrated at all), the idea is to also offer the option to suspend them locally, if desired because the action is a reboot or the like.

I don't understand the reasoning - if we have HA available, why would it be preferred to disrupt services by putting VMs in freeze state?
Please make that configurable!
It is. Set the shutdown policy to migrate and that will happen for both reboot and shutdown. The section you quoted describes what happens when the conditional policy is configured; that could be structured or hinted at a bit more explicitly there.
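Concretely, that is the ha property in /etc/pve/datacenter.cfg (also reachable via Datacenter -> Options in the GUI):

```
# /etc/pve/datacenter.cfg -- migrate HA resources away on node shutdown/reboot
# instead of freezing them (other documented values: conditional, freeze, failover).
ha: shutdown_policy=migrate
```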
 
