Obviously, while a GPU is being used by the VM I understand and accept that the LXCs won't work with it.
Apologies if this is a dumb question, but I just tested something that I suspected wouldn't work because I wanted to see what would happen. (I don't really know how the Linux kernel, modules, drivers etc. all work under the hood; I'm pretty uneducated in these matters but have enough curiosity to get myself into trouble.)
I created a VM with the GPU fully PCIe passed through to it while all the LXCs were actively using the GPU, and tried to start this new VM. As expected, the VM failed to start and hung in the 'starting' state. I wanted to see if PCIe passthrough of the entire raw hardware device magically took precedence over the LXC bindings. I kinda hoped the VM would start up and the LXCs would freak out and break as they lost the GPU, meaning that to get everything back I could just shut down the VM and reboot the LXCs, but that didn't happen. I entertained that impossibly slim sliver of hope because my understanding is that unprivileged LXCs don't have direct root-level access to hardware or the kernel, so they can't 'force hold on' to hardware, but clearly Proxmox is much smarter than me and wouldn't let that happen.
The reason I am interested in this situation is that I have an RTX GPU which I am mounting into several LXCs for CUDA and transcoding purposes. Everything is working great currently: openwebui/TTS/ik_llama.cpp, jellyfin and nextcloud are all using the Nvidia GPU successfully for their respective tasks.
BUT
Eventually this Proxmox host is getting additional GPUs installed, and I'm thinking at some point I'd like to be able to regularly 'pop out' one of those additional GPUs from the LXCs and hand it over to a Windows VM temporarily.
I tried shutting down all the LXCs that the GPU is mounted into and then starting up the VM that has that GPU passed through to it. But that also doesn't work, because of course Proxmox itself is using the GPU: I can see the PVE CLI over the GPU's HDMI output on a monitor, and of course the host is handling passing all the /dev/nvidia* devices to the LXCs. My thought was maybe to unload the relevant drivers/modules, but in a multi-GPU situation that would kill all GPUs, not just the one destined for the VM.
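From what I can tell, the driver binding is actually tracked per PCI device rather than per module, which is why unloading the whole module feels too blunt. A minimal sketch (plain Python reading standard Linux sysfs paths, nothing Proxmox-specific, and purely my assumption about how to inspect this) of how I could check which driver currently owns each NVIDIA function on the host:

```python
# Sketch: list NVIDIA PCI devices and the kernel driver each one is bound to,
# by reading sysfs. Assumes standard sysfs layout; run on the Proxmox host.
from pathlib import Path

PCI_DEVICES = Path("/sys/bus/pci/devices")
NVIDIA_VENDOR_ID = "0x10de"  # NVIDIA's PCI vendor ID

for dev in sorted(PCI_DEVICES.iterdir()):
    vendor = (dev / "vendor").read_text().strip()
    if vendor != NVIDIA_VENDOR_ID:
        continue
    driver_link = dev / "driver"
    # The "driver" symlink points at whichever driver has claimed this device;
    # it is absent if the device is currently unbound.
    driver = driver_link.resolve().name if driver_link.exists() else "(unbound)"
    print(f"{dev.name}  driver={driver}")
```

If that's right, each GPU (and its HDMI audio function) shows up as its own PCI address with its own binding, so in a multi-GPU box only one of them would need to change hands.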
When I eventually have multiple GPUs installed, it would be pretty useful to have the LXCs and Proxmox just 'lose' one of the GPUs and carry on while the lost GPU gets reassigned to the VM, and then when the VM shuts down, the GPU can be repossessed by Proxmox and the LXCs.
I've come across a few old posts discussing /sys/bus/pci unbind and rescan etc., and I figure the solution has to be somewhere along those lines. But this is all still a fair bit above my pay grade.
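From those posts, my rough understanding of the mechanism is below; this is only a sketch of what I think they're doing, the PCI address "0000:0a:00.0" is just a placeholder for whatever lspci reports for the GPU I want to move, and I'd only run it as root with nothing using that GPU:

```python
# Sketch (assumption, not a tested procedure): detach ONE GPU from its current
# driver and hand it to another driver via sysfs, targeting a single PCI address
# so other GPUs on the same driver are untouched.
from pathlib import Path

GPU_ADDR = "0000:0a:00.0"  # placeholder PCI address of the GPU to hand over
DEV = Path("/sys/bus/pci/devices") / GPU_ADDR

def current_driver():
    link = DEV / "driver"
    return link.resolve().name if link.exists() else None

def unbind():
    drv = current_driver()
    if drv:
        # Detach just this device from its driver (e.g. nvidia).
        (Path("/sys/bus/pci/drivers") / drv / "unbind").write_text(GPU_ADDR)

def bind_to(driver):
    # Ask the target driver to claim this specific device. For VM passthrough
    # the target would be vfio-pci; to give it back, bind to nvidia again.
    (DEV / "driver_override").write_text(driver)
    Path("/sys/bus/pci/drivers_probe").write_text(GPU_ADDR)
    (DEV / "driver_override").write_text("\n")  # clear the override afterwards

# Hand the GPU to the VM side:   unbind(); bind_to("vfio-pci")
# Give it back after VM shutdown: unbind(); bind_to("nvidia")
```

If that's the right idea, I assume the GPU's HDMI audio function would need the same treatment, and maybe Proxmox has a cleaner built-in way (hookscripts on the VM?) to trigger it automatically at VM start/stop.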
Is this even feasible, and if so, is there any elegant way to do it? I would really appreciate some pointers.
I just found a successful attempt by an Arch Linux user back in 2013, but that was a single-GPU system, so if I attempted to recreate their solution on Proxmox, would I end up losing all GPUs?