HA with PCIe host passthrough

Denary

New Member
Sep 21, 2024
I run a Frigate CCTV system in a VM on my two-node cluster with high availability enabled. Because the VM uses a passed-through PCIe device, a live migration cannot take place, but that is not the main issue I have.

Node Failure
In the event of a node failure, everything works perfectly. The VM moves from the first node to the second, using the mapping for the Coral TPU device on the second node. Perfect!

Node Maintenance
If I set the node maintenance flag with 'ha-manager crm-command node-maintenance enable {node}', or if I have the HA shutdown policy set to 'migrate', the VM tries to live migrate. With a PCIe device attached the migration fails, as expected, but ha-manager then simply retries and gets stuck in a loop. I have to disable HA for the VM manually and migrate it myself.
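For anyone hitting the same loop, here is a minimal sketch of doing the move by hand before maintenance is enabled, so HA never attempts the live migration. It uses 'ha-manager relocate' (which stops the service on the old node and starts it on the target) instead of disabling HA first; that is a substitution on my part, not something confirmed as the recommended approach. The VM ID 100 and the node names pve1 and pve2 are placeholders, and the helper names are made up for illustration. The script only prints the commands unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch only: relocate a passthrough VM (stop, move, start) before putting
# its node into maintenance. VM ID and node names are hypothetical.

# Print the command by default; set DRY_RUN=0 to actually execute it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Usage: relocate_then_maintain VMID TARGET_NODE SOURCE_NODE
relocate_then_maintain() {
    vmid=$1
    target=$2
    src=$3
    # 'relocate' stops the service on the old node and restarts it on the target.
    run ha-manager relocate "vm:${vmid}" "${target}" || return 1
    # In a real run, wait here until 'ha-manager status' shows the VM
    # started on the target node; the relocate request is processed asynchronously.
    run ha-manager crm-command node-maintenance enable "${src}"
}

# Dry run with placeholder values: prints the two commands in order.
relocate_then_maintain 100 pve2 pve1
```

This does not fix the retry loop itself; it only sidesteps it by making sure the VM is already off the node when maintenance mode is requested.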
I also found another odd bug: removing the VM from the HA manager left the node in a strange state where most of its PVE/Linux processes, including SSH, had stopped, but the VM was still running.

I feel like the migration action getting stuck is a bug. Is there a way to flag a virtual machine within HA to say "this virtual machine cannot live migrate; shut it down, migrate it, then start it back up"? If not, is there a suggestion box somewhere?
 
