PBS starting up stopped VMs on Backup

Lephisto

Jun 22, 2019
Hi,

I noticed one weird behaviour:

I have a few VMs in my cluster that are stopped on purpose. HA is set to request state = stopped.

Now when the PBS backup is running, I see this:

INFO: starting kvm to execute backup task

The VM is being booted up, which can cause some trouble.

Proxmox and PBS are both on the latest version.

Any clarification on this?

thanks,
Meph
 
Hi,
Yes, we need to leverage QEMU's block drivers to be able to handle all the possible image formats during backup, so the backup task starts a QEMU process even for a stopped VM. But the VM is started in a paused state, so how can it cause trouble?
 
I noticed this behaviour because the telegraf agent inside the VM started transmitting telemetry to influxdb, and when it was shut down at the end of the backup, a deadman alert was raised.

I guess this is not intended behaviour?
 
Ok, forget about it.

It was not a deadman alert but a qemu-ga alert, which I built into our observability setup to get informed when guest agents crash somewhere (https://github.com/lephisto/check-ga).

It basically issues a qmp guest-ping every 5 minutes. If that happens while the backup is running, a qemu-ga alert is raised because the guest-ping returns 255. I will adjust my script.

Sorry for bothering you, and thanks for the clarification about the PBS behaviour.

regards
 
I think you could just check whether the VM is currently locked for backup, for example.
No problem at all. It is a good question, because the behavior is rather unexpected if you are not familiar with the internals. It's not only PBS, but all VM backups.
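
The lock check suggested above could look roughly like this. It is only a minimal sketch, assuming that a running backup shows up as a `lock: backup` line in the VM configuration (for example in the output of `qm config <vmid>`). The function names `current_lock` and `should_ping` are made up for illustration, and fetching the config text is left to the caller.

```python
from typing import Optional


def current_lock(config_text: str) -> Optional[str]:
    """Return the value of the `lock` key in a VM config, or None.

    Assumption: the config is plain `key: value` lines and an active
    backup is recorded as `lock: backup`. Parsing stops at the first
    `[section]` header so that snapshot sections are ignored.
    """
    for raw in config_text.splitlines():
        line = raw.strip()
        if line.startswith("["):
            break
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition(":")
        if sep and key.strip() == "lock":
            return value.strip()
    return None


def should_ping(config_text: str) -> bool:
    """Skip the guest-ping while the VM is locked for backup."""
    return current_lock(config_text) != "backup"
```

Running VMs are locked during their backup as well, so a check like this would skip the guest-ping for them too while the backup is in progress.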
 
Well, it applies to all backups that rely on QEMU's snapshot mechanisms. There are backup/replication tools that use Ceph or ZFS snapshots instead and bypass QEMU entirely.

Thanks again, I will implement additional checks. :)
 
This is also very annoying if you have multiple VMs sharing the same PCIe device, which usually isn't a problem as long as you only run one of them at a time. But if one of them is already running, the backups of all the other stopped VMs will fail, because the backup task can't start a VM whose PCI device is already in use.

Would it be possible, in theory, for a backup job to temporarily remove or just ignore those PCI devices when starting a stopped VM for a backup?
If the backup job isn't booting the guest OS, it shouldn't matter if the PCI device is missing, right?

This would avoid the annoying failed backup jobs (listed in PBS for 30 days as red numbers) in these edge cases.
 
Hi,
It is currently possible to resume a VM that was started paused for backup. We'd need to prevent that to make such a change, and that would be a breaking change. Other people would then complain that they need to wait until the backup is finished to start/resume their VM.
 
Came here looking for an explanation of why PBS was starting my stopped VMs, so thanks for the explanation. The issue I was having was with my Zabbix monitoring: I would suppress the "not running" triggers indefinitely, but they kept being re-triggered, and therefore unsuppressed, because the VM got started and then stopped by PBS. I don't want to remove the trigger, as I want to know when the VMs are started so I don't forget to shut them down again, so I'll have to find a way to get Zabbix to detect this behaviour.
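
One way a host-side check might tell a backup start from a real start is to look at the QEMU run state rather than only at whether the VM process exists. The following is a rough sketch, not a tested Zabbix integration. It assumes the monitoring script can obtain `key: value` status text (for example from `qm status <vmid> --verbose`) containing a `qmpstatus` field, and that a VM started paused for backup reports something other than `running` there (QEMU's run state for a guest that has not been started yet is `prelaunch`). The function names are hypothetical.

```python
def parse_status(status_text: str) -> dict:
    """Parse top-level `key: value` lines into a dict (indented lines are skipped)."""
    fields = {}
    for raw in status_text.splitlines():
        if not raw.strip() or raw[0].isspace():
            continue
        key, sep, value = raw.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def guest_is_executing(status_text: str) -> bool:
    """True only if the guest CPUs are actually running.

    Assumption: `qmpstatus` is `running` for a booted guest and some
    other state (e.g. `prelaunch` or `paused`) for a VM that was only
    started paused for a backup.
    """
    return parse_status(status_text).get("qmpstatus") == "running"
```

A trigger built on a value like this would still fire if someone resumes the VM during the backup, since the guest is then really running.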
 
I monitor my guests by installing the Zabbix agent inside the guest OS.
That way far more metrics can be monitored, and since PBS won't boot the guest OS when starting the VM for the backup, Zabbix won't complain that the guest was started.
 
I do that as well, but Zabbix is also monitoring the host, which is just as important - the guest won't tell you about the metrics of the host.
 
Me too, but I always disable the guest metrics/triggers of the host plugin. The agent in the guest OS already monitors uptime, CPU load and RAM utilization, so I don't need those collected again by the PVE host. All I care about in the host's Zabbix template are the host-specific metrics.
 
