Our Proxmox nodes consistently fail to start VMs with GPU passthrough once those VMs have been run once or twice, presumably due to memory fragmentation. The VMs start reliably when we launch them from the command line, but they always time out when launched via the GUI. I think fragmentation makes the VMs take a wee bit longer to start than the computed timeout allows.
We cannot use hugepages on our systems. Also, the hypervisors on the affected nodes are older and quite limited on RAM (64 GB total) while the VMs use 10-32 GB each, so the problem is very apparent even though only one VM is typically active at a time.
I found a few related posts on this forum, but no actual solution. I tried to hack `vm_start_nolock` in `/usr/share/perl5/PVE/QemuServer.pm`, setting a large fixed timeout, but it doesn't seem to have an effect, so either there are other timeouts elsewhere, or I am not applying the changes correctly beyond simply saving the Perl scripts.
In posts from a few years ago I found references to timeout options in the GUI, but those no longer seem to exist in current versions of Proxmox. The only workaround I could find was to append `-timeout=X` to the VM's launch command line, but we have dozens of VMs, and more are added weekly, so updating every single one of them by hand isn't really a good option (although it became substantially less painful since we started using VM templates).
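For reference, the per-VM workaround we run from the shell looks roughly like the sketch below. The VM ID `101` and the 600-second value are placeholders rather than our real settings, and I am assuming the `--timeout` option of `qm start` (value in seconds) is the right knob here. The script only prints the command unless `RUN=1` is set.

```shell
#!/bin/sh
# Sketch: start a VM with an explicit start timeout instead of the computed default.
# VMID and START_TIMEOUT are placeholder values; adjust them to the VM in question.
VMID="${VMID:-101}"
START_TIMEOUT="${START_TIMEOUT:-600}"   # seconds

# Build the command; assumes `qm start` accepts --timeout in seconds.
CMD="qm start ${VMID} --timeout ${START_TIMEOUT}"

# Print by default so the command can be reviewed; set RUN=1 to execute it on a node.
if [ "${RUN:-0}" = "1" ]; then
    $CMD
else
    echo "$CMD"
fi
```

This works for a single VM, but it has to be repeated for every VM ID, which is exactly what we would like to avoid.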
What is the recommended way to globally increase timeouts when launching via the GUI?