PCI passthrough allows you to use a physical PCI device (graphics card, network card) inside a VM (KVM virtualization only).
If you "PCI passthrough" a device, the device is not available to the host anymore. Note that VMs with passed-through devices cannot be migrated.
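Just to illustrate the "not available to the host anymore" part, here is a minimal sketch (assuming the standard Linux sysfs layout and the common setup where passthrough devices get bound to vfio-pci; the PCI address is only an example) that checks which driver currently owns a device:

```python
# Minimal sketch: report which kernel driver a PCI device is bound to.
# Assumes the standard /sys/bus/pci layout; the address below is an example.
from pathlib import Path


def pci_driver(slot: str) -> str | None:
    """Return the driver name bound to a PCI device like '0000:01:00.0', or None."""
    link = Path("/sys/bus/pci/devices") / slot / "driver"
    return link.resolve().name if link.exists() else None


if __name__ == "__main__":
    slot = "0000:01:00.0"  # adjust to your GPU/NIC
    drv = pci_driver(slot)
    if drv == "vfio-pci":
        print(f"{slot} is bound to vfio-pci, i.e. reserved for passthrough")
    elif drv is None:
        print(f"{slot} has no driver bound")
    else:
        print(f"{slot} is bound to {drv}; the host still owns it")
```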
> but we have to do some adaptations of the code for that
Is that planned or just aspirational?
> is that planned or just aspirational?
(This is 100% personal opinion, so sorry in advance for exaggerations and strong words.)
> You realize that this isn't a 10 minute exercise?
I have worked in development on operating systems, on features you have used every day if you use Windows, Hyper-V and remoting, so yes, I have an incredibly good idea of the amount of work involved, not to mention the reliance on upstream features.
> offline migration should work without problems
Now that I have actually implemented a multi-node pool, I have some more observations / asks / questions:
> I have worked in development on operating systems … so yes, I have an incredibly good idea of the amount of work, not to mention reliance on upstream features.
Well, I've been there and wanted that, too.
I asked because I want to plan my strategy of what I put where. The feature is evidently something the devs have noodled on, and I just wanted to get a sense of how they think about it, but thanks for replying for them instead.
> is that planned or just aspirational?
We want to implement it, yes, but there is no timetable yet, as this touches quite a few places in our VM management logic.
> Feature Request: when clicking migrate on an online machine in the UI - rather than blocking the UI from doing a migrate, offer to do an offline migration (aka shut down the machine, move it, restart it). I assume if I use the command line to do the move I won't be blocked like in the UI?
No, currently there is no such feature for VMs (containers have such a 'restart' migration, VMs do not yet; feel free to open an enhancement request here: https://bugzilla.proxmox.com).
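For reference, the shutdown/move/restart sequence can already be scripted by hand. This is only a sketch built around the standard qm subcommands (shutdown, migrate, start), with a made-up VM ID and node name, minimal error handling, and the assumption that the final start is issued on the target node via SSH:

```python
# Sketch of an "offline" migration: stop the VM, migrate it while stopped,
# then start it again on the target node. Uses the standard qm subcommands;
# VM ID and node name are placeholders, error handling is minimal.
import subprocess


def run(cmd: list[str]) -> None:
    print("+", " ".join(cmd))
    subprocess.run(cmd, check=True)


def offline_migrate(vmid: int, target: str) -> None:
    run(["qm", "shutdown", str(vmid)])              # graceful shutdown on the source node
    run(["qm", "migrate", str(vmid), target])       # offline migration of the stopped VM
    run(["ssh", target, "qm", "start", str(vmid)])  # start it where it now lives


if __name__ == "__main__":
    offline_migrate(100, "pve-node2")
```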
> - I am unclear what happens in an HA event - will the machine migrate to a running node, start up and just pick the first device?
If you mean a node losing connection to the rest of the cluster by 'HA event', then no, this cannot work. If a node loses connection, the node will fence itself and the VM will simply be restarted on another node (according to the HA group settings).
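A side note, purely as a sketch: the "according to the HA group settings" part is what lets you make sure such a restart only lands on nodes that actually have the hardware. Assuming the standard ha-manager CLI (groupadd/add), with made-up group, node and VM names:

```python
# Sketch: restrict HA restarts of a GPU VM to nodes that actually have the
# hardware, by putting the VM into an HA group. Uses the standard ha-manager
# CLI; group/node/VM names are placeholders.
import subprocess


def sh(cmd: list[str]) -> None:
    subprocess.run(cmd, check=True)


# Only the GPU-equipped nodes go into the group ...
sh(["ha-manager", "groupadd", "gpu-nodes", "--nodes", "pve1,pve2"])
# ... so an HA restart of this VM cannot land on a node without a suitable device.
sh(["ha-manager", "add", "vm:100", "--group", "gpu-nodes", "--state", "started"])
```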
> Given proxmox understands the pool, and the first device is picked, I assume the worries in a live migration are:
> - memory state of the vGPU being invalidated, as the target VFs will not have state and need to be reinitialized?
> - whether the guest OS sees a hotplug event or not
> - whether the guest OS driver can cope with that (on Windows, given the driver model, my hope would be that it would be treated (at worst) like any other video driver crash and re-initialize?)
> - I am sure I am missing something too...?
Actually no, the memory state etc. is handled by QEMU and the driver, so that is not the problem.
> - I guess for PCIe passthrough there would be a need to treat devices as either hot-plug or not (i.e., some devices would be fine to migrate live and some would not - and there would need to be a UI mechanism to classify them.... definitely a complex set of UI features required!)
It does not really have anything to do with hotplug, but yes, as I wrote above, it's also a UX problem.
> - I love that the VM on start will just pick the first available VFs - that's super neat!
Yes, I implemented that because it's super convenient; glad to hear it works as intended.
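For anyone curious what "pick the first available VF" boils down to conceptually, here is a rough sketch (not the actual Proxmox code): it lists the SR-IOV VFs of a physical function via the sysfs virtfn* links and picks the first one bound to vfio-pci. The PF address is an example, and it does not track which VFs are already in use by running VMs:

```python
# Rough sketch (not the Proxmox implementation): enumerate the SR-IOV VFs of a
# physical function via its virtfn* symlinks and pick the first one that is
# bound to vfio-pci. The PF address is an example.
from pathlib import Path


def list_vfs(pf_slot: str) -> list[str]:
    """Return the PCI addresses of all VFs belonging to the given PF."""
    pf = Path("/sys/bus/pci/devices") / pf_slot
    links = sorted(pf.glob("virtfn*"), key=lambda p: int(p.name[len("virtfn"):]))
    return [link.resolve().name for link in links]


def first_passthrough_ready_vf(pf_slot: str) -> str | None:
    """Return the first VF that is bound to vfio-pci, or None."""
    for vf in list_vfs(pf_slot):
        driver = Path("/sys/bus/pci/devices") / vf / "driver"
        if driver.exists() and driver.resolve().name == "vfio-pci":
            return vf
    return None


if __name__ == "__main__":
    print(first_passthrough_ready_vf("0000:01:00.0"))
```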
> vm will simply be restarted on another node
Thanks, that's what I meant - I should have said failover, not migrate. Thanks!
> (e.g. how do we detect if a vGPU is live-migratable? do we simply let the user determine that and fail if it does not work?)
Ahh, I love these dilemmas, no right answer… Unless you are doing a class-based allow list where you hard-code which device classes are and are not migratable, I think you are only left with defaulting to everything being non-migratable and then letting the admin declare what is migratable. Having a list the Proxmox team would have to keep accurately populated seems to be an exercise in 'whack-a-mole' where the list is constantly updated - but I don't know enough about how PCIe devices declare themselves to know (or what Linux kernel driver constructs might allow that - I am more of a Windows guy…).
> Ahh I love these dilemmas, no right answer…
Yeah, if the kernel/device/driver exposes some flag, that would be very nice, but I haven't looked into that yet (too much to do, too little time). In any case I want it to be intuitive and mostly friction-less (so it should not be too much work, but you shouldn't be able to wrongly configure it either, if possible...).
> but you shouldn't be able to wrongly configure it either, if possible.
Indeed, as few 'hangnails' as possible is a good mantra.
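Purely to illustrate the "default to non-migratable, let the admin opt devices in" policy discussed a couple of posts up: this is a hypothetical sketch, not how Proxmox implements or plans to implement it, and the vendor:device IDs and mapping structure are made up for the example:

```python
# Hypothetical policy sketch, not Proxmox code: devices default to
# non-migratable, and the admin explicitly opts specific vendor:device IDs in.
MIGRATABLE_ALLOW_LIST: set[str] = {
    "10de:20b5",  # example vendor:device ID the admin has declared migratable
}


def live_migration_allowed(vendor_device: str) -> bool:
    """Default-deny: only admin-listed devices may be live-migrated."""
    return vendor_device in MIGRATABLE_ALLOW_LIST


def check_migration(vendor_device: str) -> str:
    if live_migration_allowed(vendor_device):
        return "live migration permitted (admin opted this device in)"
    return "refuse live migration; offer offline (shutdown/move/start) instead"


if __name__ == "__main__":
    print(check_migration("10de:20b5"))  # on the allow list
    print(check_migration("8086:1521"))  # not listed -> offline only
```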