We use Proxmox to spawn ephemeral virtual machines from templates as linked clones, and the QEMU guest agent (qemu-ga, qga) to install our software and run e2e tests. We cannot use SSH or anything network-related, since our software meddles with networking and other OS stuff.
When I run a single test (1 VM), everything works fine almost all of the time. The more I try to scale it, the more qga timeouts I get. We don't necessarily run many commands on one VM; rather, we spawn 5, 10, or 20 VMs and execute commands on each of them via qga. Even commands that should finish in mere milliseconds fail.
No resource seems to be exhausted: there is enough RAM, SSD, and CPU power.
I've already increased the service daemons and proxy daemons to handle our requests, but it made no difference.
Sometimes it works fine, sometimes some tests fail (due to qga timeouts), and sometimes almost all of them fail.
I've manually modified the qga timeouts in the Perl code on my PVE host, which helped a little, but I need a completely stable solution.
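To make the kind of client-side stopgap concrete, here is a minimal Python sketch (not our actual harness) that caps how many agent calls are in flight at once and retries a call when it times out. All names here (`GuestAgentTimeout`, `run_with_retry`, `run_on_all`, and the `call` argument) are hypothetical; `call` stands for whatever actually issues the agent command for one VM, for example a wrapper around `qm guest exec`. It would only mask the timeouts, not explain them.

```python
import time
from concurrent.futures import ThreadPoolExecutor


class GuestAgentTimeout(Exception):
    """Hypothetical error: raised by `call` when the guest agent times out."""


def run_with_retry(call, vmid, attempts=3, base_delay=0.5):
    """Invoke call(vmid); on GuestAgentTimeout, retry with exponential backoff."""
    for attempt in range(attempts):
        try:
            return call(vmid)
        except GuestAgentTimeout:
            if attempt == attempts - 1:
                raise  # out of attempts, surface the timeout to the test
            time.sleep(base_delay * (2 ** attempt))


def run_on_all(call, vmids, max_parallel=4, attempts=3, base_delay=0.5):
    """Run `call` once per VM with at most `max_parallel` calls in flight."""
    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        futures = {
            vmid: pool.submit(run_with_retry, call, vmid, attempts, base_delay)
            for vmid in vmids
        }
        # result() re-raises any exception from the worker thread.
        return {vmid: future.result() for vmid, future in futures.items()}
```

Lowering `max_parallel` while keeping the number of VMs fixed would also be a cheap way to check whether the timeouts depend on how many agent calls run concurrently rather than on how many VMs exist.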
Now my question, before I dive even further down the rabbit hole: am I looking in the right direction?
Is this a qga problem? Could vsock be a solution?
Does Proxmox not handle many concurrent qga execs well, i.e. is this a resource problem?
Is there another approach that would solve my problem?