Problem with worker VM start initiated by Veeam on PVE

I'm working with Veeam support on this, too. The standard Veeam PVE logs don't really give a clue about what's happening. The support engineer asked me to set my log level to debug (in C:\Program Files\Veeam\Plugins\PVE\Service\appsettings.json). I've done that; now I just need the failing startup of the Veeam worker to happen again. Hopefully the logs will then give more info.
 
I got info from Veeam support: we can actually just leave the Veeam worker in an idle/powered-on state.

In C:\Program Files\Veeam\Plugins\PVE\Service\, edit appsettings.json.
Under "Workers", change KeepTurnedOn from false to true:
"KeepTurnedOn": true

Save, then reboot the server or restart the Veeam PVE Service.
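
For anyone who would rather script this than edit by hand, here is a minimal Python sketch of the same change. It assumes that appsettings.json is plain JSON (no comments) and that KeepTurnedOn sits directly under a top-level "Workers" object, which is how I read the description above; I have not checked the exact layout against every plug-in version. The function name is made up for illustration, and the script rewrites the file with its own indentation after saving a .bak copy.

```python
import json
from pathlib import Path


def set_keep_turned_on(settings_path, value=True):
    """Set Workers.KeepTurnedOn in appsettings.json and return the previous value.

    Assumption: the key lives directly under a top-level "Workers" object.
    """
    path = Path(settings_path)
    # utf-8-sig tolerates a byte-order mark, which some Windows editors add.
    original_text = path.read_text(encoding="utf-8-sig")
    settings = json.loads(original_text)

    workers = settings.setdefault("Workers", {})
    previous = workers.get("KeepTurnedOn")
    workers["KeepTurnedOn"] = value

    # Keep a copy of the untouched file next to it before overwriting.
    path.with_name(path.name + ".bak").write_text(original_text, encoding="utf-8")
    path.write_text(json.dumps(settings, indent=2), encoding="utf-8")
    return previous
```

You would call it as set_keep_turned_on(r"C:\Program Files\Veeam\Plugins\PVE\Service\appsettings.json") from an elevated prompt, then restart the Veeam PVE Service as described above.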

I am going to try this out for a while and see how it does. The worker barely uses any CPU while idle, though it does hold on to some RAM. That is still less risky than having it fail to power up correctly and miss checkpoints on VMs.

I did test what happens if you reboot the Veeam server with the worker powered on: the Veeam worker stays powered on. Once a backup is started, it resets the Veeam worker, and you can see the uptime clock reset.

Seems good so far, will let it go for the rest of the week and see if it's more stable.
Thanks for sharing this info. I just edited the file - let's see...

And BTW: the newest Veeam B&R 12.3.1.1139 still has the same issue.
 
We've used Veeam with ESXi (6.7, 7.0 and 8.0) and it's really, really troublesome. Switching to PBS for our new PVE cluster made everything 100% better. The backup speed has gone up drastically. Backups just work! In Veeam we have to fiddle with the options every single week and it really takes a toll on us.
So: give PBS a try and never look back :)
 
I opened a ticket with Veeam, and they demonstrated that they only send a POST API command to start the worker virtual machine. According to Veeam, the issue is on the Proxmox side.
Example : <== Request "Post" "https://pve-02:8006/api2/json/nodes/pve-02/qemu/119/status/start"
This request is accepted, but starting the virtual machine then gets stuck on the Proxmox side.
It then times out with the message:
Failed to prepare the worker pve-veeam-prod-worker02: Failed to power on the worker VM: start failed: org.freedesktop.DBus.Error.NoReply: Did not receive a reply.
Possible causes include: the remote application did not send a reply, the message bus security policy blocked the reply, the reply timeout expired, or the network connection was broken.; W2K11 : An unknown Proxmox VE error has occurred; There are no available workers in the cluster pve-02. Performance may be affected; Job finished with error at 3/24/2025 6:26:43 PM
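
If someone wants to reproduce the start call outside of Veeam, here is a rough Python sketch that sends the same POST to the endpoint from the log line above. The host, node and VM ID are taken from that example. Authenticating with an API token (header value "PVEAPIToken=USER@REALM!TOKENID=SECRET") is my assumption, since the thread does not show how Veeam logs in, and the function names are made up for illustration.

```python
import json
import urllib.request


def build_start_request(host, node, vmid, api_token, port=8006):
    """Build (but do not send) the POST that asks PVE to start a VM.

    api_token is assumed to look like "USER@REALM!TOKENID=SECRET".
    """
    url = f"https://{host}:{port}/api2/json/nodes/{node}/qemu/{vmid}/status/start"
    return urllib.request.Request(
        url,
        data=b"",  # the start call needs no parameters here
        method="POST",
        headers={"Authorization": f"PVEAPIToken={api_token}"},
    )


def start_vm(request, timeout=30):
    """Send the request and return the "data" field of the JSON reply.

    On success PVE answers with a task ID (UPID); the start itself runs
    asynchronously as that task. With the default self-signed certificate
    this call fails TLS verification unless the PVE CA is trusted locally.
    """
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)["data"]
```

Because the HTTP call only queues a task, an accepted request that later hangs would match what Veeam describes: the POST succeeds and the failure only shows up in the start task on the node.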
 
I'm just wondering why the Veeam worker test has never failed so far (and I've tried it almost every time the backup failed).
 
Hey IT Dude,
Has this solution been working for you? To me, it seems like the only thing I haven't tried yet and could solve the problem. But before I start messing around with the configs, I'd like to ask you if this really solved the issue?
Best Regards,
Enex
 
Another thing that might have fixed the issue for me: I noticed I had 2 jobs starting at the same time. Those 2 jobs had VMs on the same host, so it could be that 2 simultaneous start commands were sent to the same worker, resulting in a hang.
I just noticed that since my change to 4 CPUs I also migrated a VM to another host. So now those 2 jobs still start at the same time but start different workers.
 

Ya, so far it works solid. The worker just stays on so it never misses a beat.
Honestly, editing the config file is just opening it in Notepad, changing a false to a true, and hitting save. So you won't mess with much.

When I applied the 12.3.1.1139 patch, it reset the config as well. So keep that in mind for future updates.
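
Since a patch can silently put the setting back, a small check after each update might help. This is only a sketch: it assumes the same layout as described earlier in the thread (KeepTurnedOn directly under "Workers" in a plain-JSON appsettings.json), and the function name is made up.

```python
import json
from pathlib import Path


def keep_turned_on_enabled(settings_path):
    """Return True only if Workers.KeepTurnedOn is present and set to true."""
    # utf-8-sig tolerates a byte-order mark at the start of the file.
    text = Path(settings_path).read_text(encoding="utf-8-sig")
    settings = json.loads(text)
    return settings.get("Workers", {}).get("KeepTurnedOn") is True
```

Run it against C:\Program Files\Veeam\Plugins\PVE\Service\appsettings.json after patching; a False result means the edit has to be made again.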
 
Thank you for the reply. I'll check it out and will update you guys on whether it works in my environments.
 
Sorry, despite the patch, the issue has occurred again. Veeam recommends opening a ticket with Proxmox.
 
Update: Since changing the config a few days ago, it seems to work. I had updated before changing the config, so I don't know whether the update would have overwritten it.
 
I will apply the configurations you suggested. As soon as I get some results, I’ll give you feedback.


The worst part about this error is that it doesn’t have a “parent” — sometimes the problem seems to be with Proxmox, other times with Veeam, but no one solves it.
 
I've been using Veeam's PVE plug-in since release and I must say, I've never had an issue with the worker not starting or hanging. It simply works for me. PBS in my environment was way too slow and doesn't meet my needs either. I'd be interested to see whether there is a trend across your environments by comparison.
 
Well, can you back up to an immutable S3 storage with PBS? VBR can do it very well!
 
You can't at the moment, but by utilizing pull-sync and the permission system of PBS you can have something similar:
https://pbs.proxmox.com/docs/storage.html#ransomware-protection-recovery

Basically you have one PBS in your network and another one offsite. The offsite PBS is allowed to pull backups from your local PBS, but not to remove or alter anything on it, and vice versa. And no data is ever edited, just added (and removed once it is no longer referenced by any backup snapshot). Thus a bad actor wouldn't be able to do anything on your remote PBS even if he manages to take over your local PBS, and vice versa. Even better: to do a pull, the remote PBS doesn't even need to allow any incoming connection from your network, as long as your local PBS allows incoming connections on port 8007. Since a bad actor on your local PBS can't even connect to the remote one, this is quite good protection. In case of a restore event you would temporarily enable access to the remote PBS, restore everything, and disable access again afterwards.

Now this approach obviously doesn't fit every requirement, and the remote PBS is still not immutable if some bad actor manages to take it over, but for the scenario "Shit, we have a hacker in our local infrastructure and can't trust it anymore. We need to wipe everything and rebuild from offsite backups" this might be sufficient.

Another reason to use Veeam is if you need stuff like application-aware backups for Active Directory and MS SQL Server (PBS doesn't have support for it and probably never will). In the German forum @Falk R. mentioned that some of his customers achieved significant cost savings by switching from an all-Veeam setup to PBS plus Veeam licenses just for Veeam Agents inside the MS SQL + AD VMs. PBS takes care of the VMs and Veeam of the application-level stuff. In his customers' case the combination was still way cheaper than doing everything with Veeam, since fewer licenses were needed.

So like always: It really depends on your use case ;)