Proxmox VE 7.2 released!

Zerstoiber · Jun 5, 2022

t.lamprecht said:
It is. Set the shutdown policy to migrate and it will happen for both reboot and shutdown. The section you quoted describes what happens if the conditional policy is configured; that could be structured or hinted at more explicitly there.

Thomas, sorry, you are right!
In my test lab (configured for "migrate"), shutdown performed the migration as expected.

I just reviewed the lab environment to find out why it did not work when restarting a node.
It turns out I did some testing in between, cloned some VMs, and forgot to add them to HA! So those were not migrated later, but frozen instead.
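
A minimal sketch of how one might catch this earlier: compare the list of all VM IDs against the HA resource list and report the ones that are missing. The inventory below is hypothetical, and the sketch assumes HA resource IDs of the form "vm:<vmid>"; how the two lists are collected from the cluster is deliberately left out.

```python
def vms_missing_from_ha(all_vmids, ha_resources):
    """Return the sorted VM IDs that have no matching 'vm:<vmid>' HA resource."""
    managed = set()
    for sid in ha_resources:
        kind, _, vmid = sid.partition(":")
        # Only VM resources count here; other resource types are ignored.
        if kind == "vm" and vmid.isdigit():
            managed.add(int(vmid))
    return sorted(set(all_vmids) - managed)


# Hypothetical inventory: 102 and 103 were cloned but never added to HA.
all_vmids = [100, 101, 102, 103]
ha_resources = ["vm:100", "vm:101", "ct:200"]
print(vms_missing_from_ha(all_vmids, ha_resources))
```

Anything this reports would not be migrated on node shutdown or reboot, even with the "migrate" policy set.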

Coming from VMware, some things require a bit more learning - in vCenter, you don't have to explicitly enable HA for each VM. It's the other way around: it lets you configure overrides to skip actions when HA/DRS is globally enabled.

hookas · Jun 9, 2022

After upgrading to 7.2, we see many kernel panics in KVM Linux guests (various OSes, from Debian 7 to the latest Debian 11, with or without the QEMU agent) when live migrating from one host to another in the datacenter, in all directions.

For example, 2-3 out of 5 migrated VMs crash at random with a kernel panic. For safe migrations (to prevent data loss from damaged file systems), we must shut down the VM before migrating.

We didn't have any problems with live migration of KVM guests before the upgrade to VE 7.2.
Is there a solution, and does the Proxmox team know about kernel panics during live migration of KVM VMs?

We use ZFS volumes (rpool) on all hosts.

Our hosts:
Host 1:
16 x Intel(R) Xeon(R) CPU E5620 @ 2.40GHz (2 Sockets)
Kernel Version Linux 5.15.35-1-pve #1 SMP PVE 5.15.35-3
PVE Manager Version 7.2-4

Host 2:
4 x Intel(R) Xeon(R) CPU E3-1220 v3 @ 3.10GHz (1 Socket)
Kernel Version Linux 5.15.35-1-pve #1 SMP PVE 5.15.35-3
PVE Manager Version 7.2-4

Host 3:
16 x Intel(R) Xeon(R) Silver 4110 CPU @ 2.10GHz (1 Socket)
Kernel Version Linux 5.15.35-1-pve #1 SMP PVE 5.15.35-3
PVE Manager Version 7.2-4

Host 4:
12 x Intel(R) Xeon(R) CPU E5-2630 v2 @ 2.60GHz (1 Socket)
Kernel Version Linux 5.15.35-1-pve #1 SMP PVE 5.15.35-3
PVE Manager Version 7.2-4

aaron · Jun 9, 2022

hookas said:
After upgrading to 7.2, we see many kernel panics in KVM Linux guests (various OSes, from Debian 7 to the latest Debian 11, with or without the QEMU agent) when live migrating from one host to another in the datacenter, in all directions.

There seems to be a bug in the kernel when live migrating from newer to older CPUs. We are currently evaluating whether we can backport the fix (see our bugtracker: https://bugzilla.proxmox.com/show_bug.cgi?id=4073#c27).

The CPUs in your cluster seem to span quite a few generations. As a workaround for the time being, you can use an older kernel; 5.13 should work fine AFAIK. You can use the "proxmox-boot-tool kernel pin" command so that you do not have to select the kernel on each reboot.
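
To see which nodes would need the older kernel pinned, a small sketch like the following can help. It only parses kernel release strings; the rule that the 5.15 series is affected and 5.13 is not is an assumption taken from the reports in this thread, and the host names and the "host5" entry are made up for illustration.

```python
import re

# Assumption from this thread: guests crash when live migrating on 5.15, not on 5.13.
AFFECTED_SERIES = {(5, 15)}


def kernel_series(release):
    """Extract (major, minor) from a release string such as '5.15.35-1-pve'."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    return int(match.group(1)), int(match.group(2))


def hosts_to_pin(host_kernels):
    """Return the sorted host names whose running kernel is in an affected series."""
    return sorted(
        host
        for host, release in host_kernels.items()
        if kernel_series(release) in AFFECTED_SERIES
    )


# Hypothetical cluster: host1 runs the release reported above, host5 was already moved back.
host_kernels = {"host1": "5.15.35-1-pve", "host5": "5.13.19-6-pve"}
print(hosts_to_pin(host_kernels))
```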

hookas · Jun 9, 2022

aaron said:
There seems to be a bug in the kernel when live migrating from newer to older CPUs. We are currently evaluating whether we can backport the fix (see our bugtracker: https://bugzilla.proxmox.com/show_bug.cgi?id=4073#c27).

The CPUs in your cluster seem to span quite a few generations. As a workaround for the time being, you can use an older kernel; 5.13 should work fine AFAIK. You can use the "proxmox-boot-tool kernel pin" command so that you do not have to select the kernel on each reboot.

I hope this bug will be fixed soon. Yes, with the older kernel the problem is gone.

lDemoNl · Jun 14, 2022

Hi! During the upgrade to v7.2, it looks like something got corrupted: the node went offline, and on the display I can see a block device error (screenshot added). After rebooting, the node starts normally, but I don't know how this affects Proxmox.

guzi · Jun 15, 2022

There seems to be a bug in the new version's multipath handling of Fibre Channel LUNs (detected on Cisco blades with VIC FCoE HBAs).
In some circumstances the Fibre Channel connections are lost.
kernel: [2683802.439925] sd 1:0:1:7: Power-on or device reset occurred
kernel: [2683854.653599] device-mapper: multipath: 253:56: Reinstating path 65:144.
kernel: [2683854.653807] sd 2:0:0:4: Power-on or device reset occurred
kernel: [2683854.654398] sd 2:0:0:4: emc: ALUA failover mode detected
kernel: [2683854.654673] sd 2:0:0:4: emc: Found valid sense data 0x 5, 0x24, 0x 0 while sending CLARiiON trespass command.
kernel: [2683854.654746] sd 2:0:0:4: emc: at SP A Port 5 (bound, default SP B)
kernel: [2683854.654749] device-mapper: multipath: 253:56: Failing path 65:144.
multipathd then switches paths to keep the LUN up, but sometimes this does not work and the server loses all connections to the storage.

Booting kernel 5.13 via the GRUB boot menu seems to work around the issue, but 5.15 is not ready for multipath FC use.
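
To judge how often paths are flapping, one could count the "Failing path" and "Reinstating path" events per multipath map. This is only a rough sketch: the regular expression is derived from the two device-mapper lines in the excerpt above and may not match other kernel versions or message variants.

```python
import re
from collections import Counter

# Pattern modeled on the device-mapper lines quoted above.
PATH_EVENT = re.compile(
    r"device-mapper: multipath: (?P<map>\d+:\d+): "
    r"(?P<action>Failing|Reinstating) path (?P<path>\d+:\d+)\."
)


def count_path_events(log_lines):
    """Count (map, path, action) occurrences in an iterable of kernel log lines."""
    counts = Counter()
    for line in log_lines:
        match = PATH_EVENT.search(line)
        if match:
            key = (match.group("map"), match.group("path"), match.group("action"))
            counts[key] += 1
    return counts


sample = [
    "kernel: [2683854.653599] device-mapper: multipath: 253:56: Reinstating path 65:144.",
    "kernel: [2683854.653807] sd 2:0:0:4: Power-on or device reset occurred",
    "kernel: [2683854.654749] device-mapper: multipath: 253:56: Failing path 65:144.",
]
for key, count in sorted(count_path_events(sample).items()):
    print(key, count)
```

A path that is reinstated and failed again within a millisecond, as in the excerpt, shows up here as one event of each kind for the same map and path.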

ThaFacialHair · Jun 27, 2022

Just a heads-up: be careful installing this with a VGA monitor. Because of the update to the new kernel, the monitor shows an "out of range" or similar warning when you try to install. The only workaround so far is to install 7.1 and upgrade from there.

Guy · Jun 28, 2022

Since upgrading to 7.2-5, I'm finding that the mouse pointer in the noVNC console is way off from the pointer on the local machine. This makes the virtual systems almost impossible to use.

I have seen this on Windows 7, Windows 10, and Ubuntu desktops.

hookas · Jun 29, 2022

aaron said:
There seems to be a bug in the kernel when live migrating from newer to older CPUs. We are currently evaluating whether we can backport the fix (see our bugtracker: https://bugzilla.proxmox.com/show_bug.cgi?id=4073#c27).

The CPUs in your cluster seem to span quite a few generations. As a workaround for the time being, you can use an older kernel; 5.13 should work fine AFAIK. You can use the "proxmox-boot-tool kernel pin" command so that you do not have to select the kernel on each reboot.

Hello, any news or plans regarding this issue?

team2021 · Jun 30, 2022

Hello, just some feedback for those who are concerned about upgrading: on a Hetzner AX61-NVMe, the upgrade from 7.1 to 7.2 went through without issues.

cosminidis · Jul 5, 2022

After upgrading from 7.2-4 to 7.2-5, I noticed a spike in I/O delay in the CPU charts for every host in the Ceph cluster. Did anyone else notice that? Is there an explanation?

The cluster still works fine, although speed is clearly lower than on local storage.

itNGO · Jul 5, 2022

cosminidis said:
After upgrading from 7.2-4 to 7.2-5, I noticed a spike in I/O delay in the CPU charts for every host in the Ceph cluster. Did anyone else notice that? Is there an explanation?

The cluster still works fine, although speed is clearly lower than on local storage.

Did you measure this somehow, e.g. with benchmarks? This sounds like it could be a big issue.

cosminidis · Jul 6, 2022

No, I just noticed it on the dashboard. IO delay used to be at 0 on 7.2-4 (the initial install); after the upgrade to 7.2-5 it stays constant at 2-3%.

I haven't upgraded to 7.2-6 yet, I'm wondering if anyone tried it out.
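
For anyone who wants a number rather than a reading off the chart, here is a rough sketch that computes the iowait share between two samples of the aggregate "cpu" line from /proc/stat. It assumes the dashboard's IO delay corresponds to the kernel's iowait counter, which is not confirmed in this thread, and the two sample lines are synthetic.

```python
def iowait_percent(before, after):
    """Percent of CPU time spent in iowait between two aggregate 'cpu' lines."""

    def fields(line):
        parts = line.split()
        if not parts or parts[0] != "cpu":
            raise ValueError("expected the aggregate 'cpu' line from /proc/stat")
        return [int(value) for value in parts[1:]]

    deltas = [new - old for old, new in zip(fields(before), fields(after))]
    # Fields: user nice system idle iowait irq softirq steal (guest time is part of user/nice).
    total = sum(deltas[:8])
    if total <= 0:
        return 0.0
    return 100.0 * deltas[4] / total


# Synthetic samples, e.g. taken a few seconds apart.
before = "cpu 1000 0 500 8000 0 0 0 0 0 0"
after = "cpu 1100 0 550 8825 25 0 0 0 0 0"
print(f"{iowait_percent(before, after):.1f}%")
```

Sampling this before and after an upgrade on the same workload would give something more comparable than the dashboard graph alone.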

itNGO · Jul 6, 2022

cosminidis said:
No, I just noticed it on the dashboard. IO delay used to be at 0 on 7.2-4 (the initial install); after the upgrade to 7.2-5 it stays constant at 2-3%.

I haven't upgraded to 7.2-6 yet, I'm wondering if anyone tried it out.

Maybe you should open a new thread for this and describe it in much more detail.
Performance issues should be taken seriously.

ermanishchawla · Jul 17, 2022

t.lamprecht said:
Creation is CLI only for now, see the docs for details and examples:
https://pve.proxmox.com/pve-docs/chapter-pveceph.html#pve_ceph_ec_pools

Hi, there are typos in the documentation: in many places "pceveph" is written instead of "pveceph". Please correct that.

t.lamprecht · Jul 17, 2022

ermanishchawla said:
Hi, there are typos in the documentation: in many places "pceveph" is written instead of "pveceph". Please correct that.

Thanks for the report, those two occurrences of the typo were already fixed in git a while ago:
https://git.proxmox.com/?p=pve-docs.git;a=commitdiff;h=81de7382b02c0a78a8a1e83bac6b30b39d4baa6b

But pve-docs hasn't been bumped since then; that will probably happen soonish, and with that we'll also roll it out on the public git.

ermanishchawla · Jul 17, 2022

t.lamprecht said:
Thanks for the report, those two occurrences of the typo were already fixed in git a while ago:
https://git.proxmox.com/?p=pve-docs.git;a=commitdiff;h=81de7382b02c0a78a8a1e83bac6b30b39d4baa6b

But pve-docs hasn't been bumped since then; that will probably happen soonish, and with that we'll also roll it out on the public git.

Thanks, I checked the git; the version bump is pending.
I tried an erasure-coded pool: performance is better than RF=3 in benchmarks. I'm testing more under load and per VM.
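
As a side note on the comparison itself, the two layouts also differ in raw-capacity overhead, which is worth keeping in mind when reading benchmark results. A small sketch, where the k=2, m=1 profile is only a hypothetical example and not the profile used in the test above:

```python
from fractions import Fraction


def usable_fraction_replicated(size):
    """Usable share of raw capacity for a replicated pool with `size` copies."""
    return Fraction(1, size)


def usable_fraction_erasure(k, m):
    """Usable share of raw capacity for an erasure-coded pool with k data and m coding chunks."""
    return Fraction(k, k + m)


print("replicated, size=3:", usable_fraction_replicated(3))
print("erasure coded, k=2 m=1 (hypothetical):", usable_fraction_erasure(2, 1))
```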

hookas · Jul 22, 2022

aaron said:
There seems to be a bug in the kernel when live migrating from newer to older CPUs. We are currently evaluating whether we can backport the fix (see our bugtracker: https://bugzilla.proxmox.com/show_bug.cgi?id=4073#c27).

The CPUs in your cluster seem to span quite a few generations. As a workaround for the time being, you can use an older kernel; 5.13 should work fine AFAIK. You can use the "proxmox-boot-tool kernel pin" command so that you do not have to select the kernel on each reboot.

There is a kernel patch for similar issues; maybe the Proxmox developer team could apply it?
https://lore.kernel.org/lkml/CAJ6HWG66HZ7raAa+YK0UOGLF+4O3JnzbZ+a-0j8GNixOhLk9dA@mail.gmail.com/T/

What do you think?

t.lamprecht · Jul 22, 2022

hookas said:
There is a kernel patch for similar issues; maybe the Proxmox developer team could apply it?
https://lore.kernel.org/lkml/CAJ6HWG66HZ7raAa+YK0UOGLF+4O3JnzbZ+a-0j8GNixOhLk9dA@mail.gmail.com/T/

I already tried to backport that series a while ago, but it doesn't apply cleanly on our 5.15-based tree, and the change is too big and touches relatively critical sections to consider just making it fit... Or did you manage to do a clean backport and verify that it fixes your issues?

hookas · Jul 22, 2022

t.lamprecht said:
I already tried to backport that series a while ago, but it doesn't apply cleanly on our 5.15-based tree, and the change is too big and touches relatively critical sections to consider just making it fit... Or did you manage to do a clean backport and verify that it fixes your issues?

No, I do not know how to do kernel backports from source code. I just asked about it.

This feature (the ability to migrate without problems between different CPU generations) was a unique and very cool feature; it would be a pity to lose it.

With the latest 5.15 kernel, KVM VM migration still has many problems (crashes). I hope you will find a solution, because servers do not always have identical CPU generations. With the 5.13.19-6-pve kernel there are no issues and everything works without crashing; I think it is the best kernel version for Proxmox at the moment.
