Conflicting maxfiles policy and retention period

velocity08

Hi Team

We have a strange issue with a backup policy.

Round 1.
We previously had a backup policy that took a backup once per week and kept 1 backup.

As things changed, we needed to update this policy to allow for more frequent backups.

Round 2.
The old policy was removed and a new policy created.
The new policy is nightly backups with
maxfiles 7 (7 days of backups).

This was all done in the GUI.

We can see that some VMs have taken up the new 7-day policy with nightly backups.
A few other VMs went to nightly backups but only keep 1 copy of the backup.

Is there a way we can fix the VMs' vzdump config so they adhere to the new nightly backups and the 7-day maxfiles policy?

I'm not able to locate any per-VM vzdump config files, so I'm wondering how this issue has happened.

Any assistance would be greatly appreciated.

It is also interesting that we are on PVE 6.0, but when logging in via SSH we see the following:

Code:
Linux pve 5.0.21-3-pve #1 SMP PVE 5.0.21-7 (Mon, 30 Sep 2019 09:11:02 +0200) x86

but in PVE it's:

Code:
proxmox-ve: 6.0-2 (running kernel: 5.0.21-3-pve)
pve-manager: 6.0-9 (running version: 6.0-9/508dcee0)

any ideas?

""Cheers
G
 
AFAIR the number of max backups is set for each storage. Do you back up all VMs to the same storage or to different ones which might have a different max backup setting?

Regarding the version output: the first line (shown at SSH login) is the running kernel, the second output lists the versions of the installed PVE packages.

You might want to install updates. Those versions are a bit old by now.
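
A quick way to compare the two values, as a minimal sketch: uname -r prints the running kernel and pveversion -v lists the package versions. The helper name running_kernel_from_pveversion is made up for illustration, and the sample line is the one from the first post.

```shell
# Hypothetical helper: extract the "running kernel" value from a proxmox-ve line.
running_kernel_from_pveversion() {
    sed -n 's/^proxmox-ve: .*(running kernel: \(.*\))$/\1/p'
}

# Sample line copied from the first post in this thread.
sample='proxmox-ve: 6.0-2 (running kernel: 5.0.21-3-pve)'
kernel=$(printf '%s\n' "$sample" | running_kernel_from_pveversion)
echo "running kernel reported by PVE: $kernel"

# On a PVE node the same comparison would be (not run here):
#   uname -r
#   pveversion -v | running_kernel_from_pveversion
```

Both commands should report the same kernel; the 6.0-x numbers belong to the PVE packages, not to the kernel.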
 
Hi @narrateourale

Updates have now been installed :)

Yes, all VMs go to the same storage, which is why it's strange behavior.

Is there any place that could hold a per-VM backup config that I can look at?

It's very strange and we really need to get it sorted out :)

""Cheers
G
 
Hi,
Could you share your /etc/pve/storage.cfg, your /etc/vzdump.conf and the backup logs, both for a VM where the maxfiles setting worked and for a VM where it didn't? Is there any obvious difference between the VMs where it worked and those where it didn't?
 
Hi @fabian

Please see the output below.
Code:
~# cat /etc/pve/storage.cfg
dir: local
        path /var/lib/vz
        content backup,vztmpl,iso

lvmthin: local-lvm
        thinpool data
        vgname pve
        content rootdir,images

cifs: backup
        path /mnt/pve/backup
        server xxxxxx
        share pve_backup
        content backup
        maxfiles 7
        username pve


~# cat /etc/vzdump.conf
# vzdump default settings

#tmpdir: DIR
#dumpdir: DIR
#storage: STORAGE_ID
#mode: snapshot|suspend|stop
#bwlimit: KBPS
#ionice: PRI
#lockwait: MINUTES
#stopwait: MINUTES
#size: MB
#stdexcludes: BOOLEAN
#mailto: ADDRESSLIST
#maxfiles: N
#script: FILENAME
#exclude-path: PATHLIST
#pigz: N

Can you please point me to the backup logs for the VM? Are these the ones you can see in the UI while the backup is running?

""Cheers
G
 
There should be log files in the dump directory of your backup storage. You can also check the task log in the GUI (bottom panel) and double click on a given backup task.
 
Those logs have been checked and nothing jumps out at me.

The backup command simply backed up all the listed VMs in one command, and this finished normally.

Then a second command is triggered to clean up and apply the retention period.

So the cleanup command must be reading a config somewhere.

Where would the cleanup config be, and would it be different for every VM?

“”Cheers
G
 
The cleanup happens directly after the backup, in the same task and should take the maxfiles argument from the storage config. You can also specify it when doing the API call and override the one from the storage, but you'd have to do that manually. Just to make sure, the backups are configured via the GUI in Datacenter>Backup?

If you share the logs, I'll have an easier time trying to reproduce the issue. Also, what's the output of ls -a run inside your dump directory?
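
To check which maxfiles value a storage actually carries, something like the following could be used. This is only a sketch: the helper name storage_maxfiles is made up, and the sample is the storage.cfg posted earlier in this thread.

```shell
# Hypothetical helper: print the maxfiles value of one storage section.
# $1 = storage ID; storage.cfg-style text is read from stdin.
storage_maxfiles() {
    awk -v id="$1" '
        /^[a-z]+: /                       { current = $2 }  # section header, e.g. "cifs: backup"
        current == id && $1 == "maxfiles" { print $2 }
    '
}

# Sample: the storage.cfg posted earlier in this thread.
sample_cfg='dir: local
        path /var/lib/vz
        content backup,vztmpl,iso

lvmthin: local-lvm
        thinpool data
        vgname pve
        content rootdir,images

cifs: backup
        path /mnt/pve/backup
        server xxxxxx
        share pve_backup
        content backup
        maxfiles 7
        username pve'

backup_max=$(printf '%s\n' "$sample_cfg" | storage_maxfiles backup)
local_max=$(printf '%s\n' "$sample_cfg" | storage_maxfiles local)
echo "backup: ${backup_max:-not set}"
echo "local:  ${local_max:-not set}"

# On a PVE node (not run here):
#   storage_maxfiles backup < /etc/pve/storage.cfg
# A manual run that overrides the storage value might look like this
# (assuming the maxfiles option listed in /etc/vzdump.conf is also accepted on the command line):
#   vzdump 104 --storage backup --maxfiles 7
```

With the posted config, only the backup storage sets maxfiles, so every job that writes to it should see the same value unless it is overridden in the call.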
 
@Fabian_E yes, via the GUI, correct.

See the attached text doc, as there were too many characters to paste into the post.

""Cheers
G
 

Attachments

  • 20200311_proxmox_backups_issue.txt (11.5 KB)
I looked through the file you attached and don't see anything out of the ordinary. The backup task will log a message
Code:
delete old backup <filename>
when it deletes a file, but that doesn't show up in the log of the backup for VM 104 you provided. Are there any such lines in the task log for the bulk backup job with multiple VMs (the one whose first few lines you posted in the attachment)? If so, could you provide that full log as an attachment?

How old are the VMs for which there are too few backups? Also, could you post your /etc/pve/vzdump.cron?

EDIT: I originally wrote that deleting a VM with the Purge option also deletes its backups. That was wrong: Purge only removes the VM from the backup jobs, it does not delete the backups.
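
To spot VMs with too few backups, the files in the dump directory can be counted per VMID. The sketch below assumes archive names of the form vzdump-<type>-<vmid>-<timestamp>.<ext>; the helper name count_backups_per_vmid and the sample file names are made up for illustration and are not taken from the reporter's system.

```shell
# Hypothetical helper: count backup archives per VMID.
# Reads file names on stdin, prints "<vmid> <count>" per line.
# Assumes names of the form vzdump-<type>-<vmid>-<timestamp>.<ext>; .log files are skipped.
count_backups_per_vmid() {
    grep -v '\.log$' |
        sed -n 's/^vzdump-[a-z]*-\([0-9][0-9]*\)-.*/\1/p' |
        sort -n | uniq -c | awk '{ print $2, $1 }'
}

# Made-up file names, for illustration only.
sample_names='vzdump-qemu-100-2020_03_07-03_45_02.vma.lzo
vzdump-qemu-100-2020_03_14-03_45_02.vma.lzo
vzdump-qemu-104-2020_03_07-15_09_11.vma.lzo
vzdump-qemu-104-2020_03_07-15_09_11.log'

counts=$(printf '%s\n' "$sample_names" | count_backups_per_vmid)
echo "$counts"

# In the dump directory of the backup storage (not run here):
#   ls | count_backups_per_vmid
```

Any VMID that shows fewer archives than expected, or is missing from the list, is a candidate for a closer look at its task log.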
 
Hi @Fabian_E, can you please provide some clear guidance on where to grab these logs from?

task log for the bulk backup job with multiple VMs

When looking in the GUI > PVE > Tasks, this only shows the tasks I've run manually, no cron tasks.

Below is the output you've requested.

Code:
# cat /etc/pve/vzdump.cron
# cluster wide vzdump cron schedule
# Automatically generated file - do not edit

PATH="/usr/sbin:/usr/bin:/sbin:/bin"

45 3 * * 6           root vzdump 100 101 102 103 104 105 106 107 108 110 111 112 113 114 115 117 118 119 120 121 122 --mailto myname@velocityhost.com.au --storage backup --mailnotification always --quiet 1 --node pve --compress lzo --mode snapshot


""Cheers
G
 
On the node where the backups are done, you can do
Code:
cd /var/log/pve/tasks
ls -lt */*vzdump*
to see the log files for the vzdump commands sorted by date.

You could also manually run the job at a convenient time and share that log.
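
Once the log files are located, they can be searched for the "delete old backup" message mentioned earlier. A minimal sketch: the helper name count_deleted_backups is made up, and the demo log contains placeholder lines rather than real vzdump output.

```shell
# Hypothetical helper: count lines in a task log that report a deleted backup.
# $1 = path to a task log file.
count_deleted_backups() {
    grep -c 'delete old backup' "$1"
}

# Stand-in log with placeholder lines (NOT real vzdump output) so the sketch runs anywhere.
demo_log=$(mktemp)
printf '%s\n' \
    'placeholder line' \
    'delete old backup <filename-1>' \
    'placeholder line' \
    'delete old backup <filename-2>' > "$demo_log"

deleted=$(count_deleted_backups "$demo_log")
echo "deletions logged: $deleted"
rm -f "$demo_log"

# On the PVE node, list every vzdump task log that mentions a deletion (not run here):
#   cd /var/log/pve/tasks
#   grep -l 'delete old backup' */*vzdump*
```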
 
attached.

""Cheers
G
 

Attachments

  • 20200317_proxmox_ls.txt (25.9 KB)
It would be great if you could provide the contents of the logs for the few most recent jobs, i.e.
Code:
-rw-r----- 1 www-data root 148523 Mar 14 05:27 F/UPID:pve:000023B2:035E6D3A:5E6BB88F:vzdump::root@pam:
-rw-r----- 1 www-data root 141212 Mar  7 16:40 7/UPID:pve:000030D2:0002DEA9:5E632017:vzdump::root@pam:
-rw-r----- 1 www-data root   8191 Mar  7 15:09 7/UPID:pve:00002334:0001CBB0:5E631D57:vzdump:104:root@pam:
-rw-r----- 1 www-data root 132961 Mar  7 05:08 E/UPID:pve:0000750E:2767F80B:5E627E0E:vzdump::root@pam:
These are just text files, you could attach them directly. I assume you tried doing a backup of 104 manually on March 7th? Is there still only one file for 104 in the dump directory?

Also I'm a bit confused, since you stated

Hi Team

We have a strange issue with a backup policy.

Round 1.
We previously had a backup policy that took a backup once per week and kept 1 backup.

As things changed, we needed to update this policy to allow for more frequent backups.

Round 2.
The old policy was removed and a new policy created.
The new policy is nightly backups with
maxfiles 7 (7 days of backups).

but the configuration (quoted below) and list of recent jobs show that the current policy is creating one backup every Saturday and not every day. And from the list of recent jobs it seems that the old policy was one backup daily?

Code:
# cat /etc/pve/vzdump.cron
# cluster wide vzdump cron schedule
# Automatically generated file - do not edit

PATH="/usr/sbin:/usr/bin:/sbin:/bin"

45 3 * * 6           root vzdump 100 101 102 103 104 105 106 107 108 110 111 112 113 114 115 117 118 119 120 121 122 --mailto myname@velocityhost.com.au --storage backup --mailnotification always --quiet 1 --node pve --compress lzo --mode snapshot


 
Hi @Fabian_E

Even if this were the case, it should be keeping 7 backups, which it is not.

I've cross-checked your notes and can confirm that the config in the UI is now (as of yesterday) set to all VMs and scheduled to execute daily at 3:45 am.

We should be seeing 7 copies being kept.

At the moment there are zero copies of VM 104, so it's even removing the 1 copy it was keeping previously.

This is really erratic behaviour and not acceptable.

We need to find a solution for this issue ASAP; we can't keep putting production VMs on the host when backups don't work as designed or scheduled.

Screenshot from 2020-03-18 20-17-40.png

Screenshot from 2020-03-18 20-16-29.png

Screenshot from 2020-03-18 20-16-22.png

There is something not quite right.

""Cheers
G
 

The configuration in the GUI screenshots does not match the output of cat /etc/pve/vzdump.cron you provided.
Could you exclude Sunday, save the config, check whether the contents of /etc/pve/vzdump.cron changed, then include Sunday again, save the config and check /etc/pve/vzdump.cron once more? The entry should start with 45 3 * * * and not 45 3 * * 6 .

I'm unable to reproduce the issue and the logs would probably help.
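
The fifth field of a cron entry is the day of week: * means every day, while 6 restricts the job to Saturdays. A small sketch for checking this field in vzdump.cron follows; the helper names vzdump_cron_dow and describe_dow are made up, and both sample entries are shortened, illustrative versions of the schedule lines discussed here.

```shell
# Hypothetical helper: print the day-of-week field of each vzdump cron entry read from stdin.
vzdump_cron_dow() {
    awk '$6 == "root" && $7 == "vzdump" { print $5 }'
}

# Hypothetical helper: describe a day-of-week field.
# In cron, "*" means every day and 6 means Saturday.
describe_dow() {
    case "$1" in
        '*') echo "every day" ;;
        6)   echo "Saturdays only" ;;
        *)   echo "restricted to day(s): $1" ;;
    esac
}

# Shortened, illustrative entries (VM list and mail options trimmed).
old_entry='45 3 * * 6           root vzdump 100 101 --storage backup --quiet 1'
new_entry='45 3 * * *           root vzdump --all 1 --storage backup --quiet 1'

old_dow=$(printf '%s\n' "$old_entry" | vzdump_cron_dow)
new_dow=$(printf '%s\n' "$new_entry" | vzdump_cron_dow)
echo "old entry runs: $(describe_dow "$old_dow")"
echo "new entry runs: $(describe_dow "$new_dow")"

# On a PVE node (not run here):
#   vzdump_cron_dow < /etc/pve/vzdump.cron
```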
 
Hi @Fabian_E

See the new output below.

Code:
~# cat /etc/pve/vzdump.cron
# cluster wide vzdump cron schedule
# Automatically generated file - do not edit

PATH="/usr/sbin:/usr/bin:/sbin:/bin"

45 3 * * *           root vzdump --quiet 1 --mode snapshot --mailnotification always --compress lzo --storage backup --mailto name@domain.com --all 1

I've completely deleted the old schedule and created a new one; the output is above.

What I'm seeing is that backups are being deleted every day, even when only 1 backup exists.

I have just executed a full backup job on the new schedule and will report back with the completed output when it's finished.

""Cheers
G
 
