Proxmox VE 7.2 released!

It is. Set the shutdown policy to migrate and it will happen for both reboot and shutdown. The section you quoted describes what happens when the conditional policy is configured; it could be structured or hinted a bit more explicitly there.
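
For reference, a minimal sketch of checking which policy a cluster is actually using. It assumes the policy is stored in /etc/pve/datacenter.cfg as a line of the form "ha: shutdown_policy=<value>" and that "conditional" applies when nothing is set; the function name and the demo input are made up for illustration.

```shell
#!/bin/sh
# Sketch: report the HA shutdown policy found in datacenter.cfg text.
# Assumption: the policy is stored as "ha: shutdown_policy=<value>".

shutdown_policy() {
    # Reads the config text on stdin and prints the configured policy,
    # or "conditional" (assumed default) when no policy line is present.
    policy=$(sed -n 's/^ha:.*shutdown_policy=\([a-z]*\).*/\1/p' | head -n 1)
    echo "${policy:-conditional}"
}

# Demo with made-up file contents; on a real node you would run:
#   shutdown_policy < /etc/pve/datacenter.cfg
printf 'keyboard: en-us\nha: shutdown_policy=migrate\n' | shutdown_policy
```

If this prints anything other than migrate, HA guests will not be moved away on shutdown or reboot in the way described above.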

Thomas, sorry, you are right!
In my testlab (configured for "migrate"), shutdown performed the migration as expected.

I just reviewed the lab environment to find out why it did not work when restarting a node.
Turns out I did some testing in between, cloned some VMs and forgot to add them to HA! So those did not get migrated later, but frozen instead.


Coming from VMware, some things require a bit more learning - in vCenter, you don't have to explicitly enable HA for each VM. It's the other way around, it lets you configure overrides to not perform actions when HA/DRS is globally enabled.
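
A quick way to spot guests that were never added to HA is to compare the VMIDs on a node with the configured HA resources. The following is only a sketch: it assumes "qm list" prints the VMID in the first column below a header line and that "ha-manager config" lists resources as lines starting with "vm:<vmid>". The function name and the sample data are hypothetical, and containers (ct:) are not covered.

```shell
#!/bin/sh
# Sketch: list VMIDs that exist on the node but are not HA resources.

missing_from_ha() {
    # $1: output of "qm list", $2: output of "ha-manager config"
    ha_ids=$(printf '%s\n' "$2" | sed -n 's/^vm:\([0-9][0-9]*\).*/\1/p')
    printf '%s\n' "$1" | awk 'NR > 1 && $1 ~ /^[0-9]+$/ { print $1 }' |
    while read -r id; do
        # Print the VMID only if it is not among the HA resource IDs.
        printf '%s\n' "$ha_ids" | grep -qx "$id" || echo "$id"
    done
}

# Demo with made-up data; on a real node you would run:
#   missing_from_ha "$(qm list)" "$(ha-manager config)"
qm_out='      VMID NAME        STATUS     MEM(MB)    BOOTDISK(GB) PID
       100 web         running    2048              32.00 1234
       101 web-clone   running    2048              32.00 1300'
ha_out='vm:100
	state started'
missing_from_ha "$qm_out" "$ha_out"
```

Each VMID printed can then be added with "ha-manager add vm:<vmid>" (or through the GUI) so that it is migrated instead of frozen.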
 
After upgrading to 7.2 we see many kernel panics in KVM Linux guests (various OSes from Debian 7 to the latest Debian 11, with or without the QEMU guest agent) when live migration is performed from one host to another in the datacenter, in all directions.

For example, 2-3 out of 5 migrated VMs crash with a kernel panic, seemingly at random. To migrate safely (and prevent data loss from damaged file systems) we must shut down the VM before migrating.

We didn't have any problems with live migration of KVM guests before the upgrade to VE 7.2.
Is there any solution, and does the Proxmox team know about kernel panic problems with KVM VM live migrations?

We use ZFS volumes (rpool) on all hosts.

Our hosts:
Host 1:
16 x Intel(R) Xeon(R) CPU E5620 @ 2.40GHz (2 Sockets)
Kernel Version Linux 5.15.35-1-pve #1 SMP PVE 5.15.35-3
PVE Manager Version 7.2-4

Host 2:
4 x Intel(R) Xeon(R) CPU E3-1220 v3 @ 3.10GHz (1 Socket)
Kernel Version Linux 5.15.35-1-pve #1 SMP PVE 5.15.35-3
PVE Manager Version 7.2-4

Host 3:
16 x Intel(R) Xeon(R) Silver 4110 CPU @ 2.10GHz (1 Socket)
Kernel Version Linux 5.15.35-1-pve #1 SMP PVE 5.15.35-3
PVE Manager Version 7.2-4

Host 4:
12 x Intel(R) Xeon(R) CPU E5-2630 v2 @ 2.60GHz (1 Socket)
Kernel Version Linux 5.15.35-1-pve #1 SMP PVE 5.15.35-3
PVE Manager Version 7.2-4
 
There seems to be a bug in the kernel when live migrating from newer to older CPUs. We are currently evaluating whether we can backport the fix (see our bug tracker: https://bugzilla.proxmox.com/show_bug.cgi?id=4073#c27).

The CPUs in your cluster seem to span quite a few generations. As a workaround for the time being, you can use an older kernel; 5.13 should work fine AFAIK. You can use the "proxmox-boot-tool kernel pin" command so that you don't have to select the kernel on each reboot.
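
The pinning workaround could look roughly like the sketch below. The version string 5.13.19-6-pve is an assumption (it is the one reported as working later in this thread); use whichever 5.13 version "proxmox-boot-tool kernel list" shows as installed on your host. The run helper and the DRY_RUN variable are made up for illustration, so by default the script only prints what it would do.

```shell
#!/bin/sh
# Sketch: pin an older kernel as the default boot entry.
# KERNEL is an assumption; use a version shown by "proxmox-boot-tool kernel list".
KERNEL="${KERNEL:-5.13.19-6-pve}"
# Dry run by default: commands are printed, not executed.
# Set DRY_RUN=0 on a PVE host to really run them.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

run proxmox-boot-tool kernel list            # show the kernels that are installed
run proxmox-boot-tool kernel pin "$KERNEL"   # boot this version by default
# Later, to go back to booting the newest kernel:
#   proxmox-boot-tool kernel unpin
```

The pin only takes effect on the next reboot, and it has to be repeated on every node that should stay on the older kernel. Depending on the tool version, its output may ask for an additional "proxmox-boot-tool refresh".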
 
I hope this bug will be fixed soon. Yes, with the older kernel the problem is gone...
 
Hi! During the upgrade to v7.2, something looks like it got corrupted: the node went offline and the display showed a block device error (screenshot attached). After a reboot, however, the node starts normally. I don't know how this affects Proxmox.
dmesg.PNG
 
There seems to be a bug in the new version's multipath handling of Fibre Channel LUNs (observed on Cisco blades with VIC FCoE HBAs).
Under some circumstances the Fibre Channel connections are lost.
kernel: [2683802.439925] sd 1:0:1:7: Power-on or device reset occurred
kernel: [2683854.653599] device-mapper: multipath: 253:56: Reinstating path 65:144.
kernel: [2683854.653807] sd 2:0:0:4: Power-on or device reset occurred
kernel: [2683854.654398] sd 2:0:0:4: emc: ALUA failover mode detected
kernel: [2683854.654673] sd 2:0:0:4: emc: Found valid sense data 0x 5, 0x24, 0x 0 while sending CLARiiON trespass command.
kernel: [2683854.654746] sd 2:0:0:4: emc: at SP A Port 5 (bound, default SP B)
kernel: [2683854.654749] device-mapper: multipath: 253:56: Failing path 65:144.
multipathd then switches between paths to try to keep the LUN up, but sometimes this does not work and the server loses all connections to the storage.

Booting kernel 5.13 via the GRUB boot menu seems to work around the issue, but 5.15 does not seem ready for multipath FC use.
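
To get a feeling for how often paths flap, the kernel log can be summarised per device and path. This is a sketch that assumes the log lines look like the ones quoted above ("device-mapper: multipath: <dm-device>: Failing path <path>." and the matching "Reinstating path" message); the function name is made up.

```shell
#!/bin/sh
# Sketch: count multipath "Failing path" / "Reinstating path" events
# per device-mapper device and path from kernel log lines on stdin.

count_path_events() {
    awk '
        /multipath:/ && /(Failing|Reinstating) path/ {
            # Locate the "multipath:" token so that a syslog prefix or
            # timestamp in front of the message does not matter.
            for (i = 1; i <= NF; i++) if ($i == "multipath:") break
            dev = $(i + 1); sub(/:$/, "", dev)     # e.g. 253:56
            action = $(i + 2)                      # Failing or Reinstating
            path = $(i + 4); sub(/\.$/, "", path)  # e.g. 65:144
            key = dev " " path
            seen[key] = 1
            if (action == "Failing") fail[key]++
            else rein[key]++
        }
        END {
            for (k in seen)
                printf "%s failing=%d reinstating=%d\n", k, fail[k], rein[k]
        }'
}

# Demo with two of the lines from the log excerpt; on a real host:
#   journalctl -k | count_path_events
count_path_events <<'EOF'
kernel: [2683854.653599] device-mapper: multipath: 253:56: Reinstating path 65:144.
kernel: [2683854.654749] device-mapper: multipath: 253:56: Failing path 65:144.
EOF
```

Comparing the counts between a node running 5.15 and one running 5.13 would show whether the older kernel really avoids the path failures; "multipath -ll" shows the current state of each path.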
 
Just a heads-up: be careful when installing this with a VGA monitor attached. Because of the update to the new kernel, the monitor shows an "out of range" or similar warning when you try to install. The only workaround so far is to install 7.1 and upgrade from there.
 
Since upgrading to 7.2-5 I'm finding that the mouse pointer in the noVNC console is way off from the pointer on the local machine. This makes it almost impossible to use the virtual systems.

I have seen this on Windows 7, Windows 10 and Ubuntu desktops.

1656431771621.png
 
There seems to be a bug in the kernel when live migrating from newer to older CPUs. We are currently evaluating whether we can backport the fix (see our bug tracker: https://bugzilla.proxmox.com/show_bug.cgi?id=4073#c27).

The CPUs in your cluster seem to span quite a few generations. As a workaround for the time being, you can use an older kernel; 5.13 should work fine AFAIK. You can use the "proxmox-boot-tool kernel pin" command so that you don't have to select the kernel on each reboot.

Hello, any news and plans about this issue?
 
After upgrading from 7.2-4 to 7.2-5, I noticed a spike in I/O delay in the CPU charts for every host in the Ceph cluster. Did anyone else notice that? Is there an explanation?

The cluster still works fine, although speed is clearly lower than on local storage.
 
Did you measure this somehow, for example with some benchmarks? This sounds like it could be a big issue.
 
No, I just noticed it on the dashboard. IO delay used to be at 0 on 7.2-4 (the initial install); after the upgrade to 7.2-5 it stays constant at 2-3%.

I haven't upgraded to 7.2-6 yet; I'm wondering if anyone has tried it out.
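
For a number that does not depend on the dashboard, iowait can be sampled directly from /proc/stat on each node. This is a sketch under the assumption that the dashboard's IO delay roughly corresponds to the iowait share of total CPU time; the function name and the sample values are made up.

```shell
#!/bin/sh
# Sketch: iowait share (in percent) between two samples of the
# aggregate "cpu" line from /proc/stat.

iowait_pct() {
    # $1, $2: two "cpu ..." lines taken some seconds apart.
    # Fields after "cpu": user nice system idle iowait irq softirq steal ...
    printf '%s\n%s\n' "$1" "$2" | awk '
        {
            total = 0
            for (i = 2; i <= 9; i++) total += $i   # user .. steal
            t[NR] = total
            w[NR] = $6                             # iowait
        }
        END {
            d = t[2] - t[1]
            if (d > 0) printf "%.1f\n", 100 * (w[2] - w[1]) / d
            else print "0.0"
        }'
}

# Demo with made-up counters; on a real node you would run:
#   a=$(grep '^cpu ' /proc/stat); sleep 5; b=$(grep '^cpu ' /proc/stat)
#   iowait_pct "$a" "$b"
iowait_pct "cpu 100 0 100 700 100 0 0 0 0 0" "cpu 150 0 150 1400 300 0 0 0 0 0"
```

Running this before and after an upgrade, ideally under a comparable load, gives figures that can be compared across versions and posted alongside benchmark results.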

1657088784408.png
 
Maybe you should open a new thread for this and describe it in much more detail.
Performance issues should be taken seriously.
 
Thanks for the report; those two occurrences of that typo were already fixed in git a while ago:
https://git.proxmox.com/?p=pve-docs.git;a=commitdiff;h=81de7382b02c0a78a8a1e83bac6b30b39d4baa6b

But pve-docs hasn't been bumped since then. That will probably happen soonish, and with that we'll also roll it out on the public git.
Thanks, I checked the git; the version bump is pending.
I tried an erasure-coded pool: performance is better than with RF=3 in benchmarks. I'm testing more under load and per VM.
 
There seems to be a bug in the kernel when live migrating from newer to older CPUs. We are currently evaluating whether we can backport the fix (see our bug tracker: https://bugzilla.proxmox.com/show_bug.cgi?id=4073#c27).

The CPUs in your cluster seem to span quite a few generations. As a workaround for the time being, you can use an older kernel; 5.13 should work fine AFAIK. You can use the "proxmox-boot-tool kernel pin" command so that you don't have to select the kernel on each reboot.

There is a kernel patch for similar issues; maybe the Proxmox development team could apply it?
https://lore.kernel.org/lkml/CAJ6HWG66HZ7raAa+YK0UOGLF+4O3JnzbZ+a-0j8GNixOhLk9dA@mail.gmail.com/T/

What do you think?
 
I already tried to backport that series a while ago, but it doesn't apply cleanly on our 5.15-based tree, and the change is too big and touches relatively critical sections, so we can't consider just making it fit... Or did you manage to do a clean backport and verify that it fixes your issues?
 
No, I don't know how to do kernel backports from source code, I just asked about it ;) Being able to migrate between different CPU generations without any problems was a very cool, unique feature, and it would be a pity to lose it :( With the latest 5.15 kernel, KVM VM migration still has many problems (crashes). I hope you will find a solution, because servers do not always have identical CPU generations. With the 5.13.19-6-pve kernel there are no issues and everything works without crashing; I think it is the best kernel version for Proxmox at the moment.
 