Conflicting maxfiles policy and retention period

The configuration from the screenshots in the GUI does not match with the output for cat /etc/pve/vzdump.cron you provided.
Could you exclude Sunday, save the config, check whether the contents of /etc/pve/vzdump.cron changed, include Sunday again, save the config and check /etc/pve/vzdump.cron again. The entry should start with 45 3 * * * and not 45 3 * * 6 .

I'm unable to reproduce the issue and the logs would probably help.
Hi @Fabian_E

see below output after manual backup:

Code:
# cat /etc/pve/vzdump.cron
# cluster wide vzdump cron schedule
# Automatically generated file - do not edit

PATH="/usr/sbin:/usr/bin:/sbin:/bin"

45 3 * * *           root vzdump --quiet 1 --mode snapshot --mailnotification always --compress lzo --all 1 --mailto user@domain.com --storage backup

we are still seeing backups that where previously on the weekly schedule only have 1 backup in the chain.

VM

104
105
122

and a few others.

this is super frustrating!

you would think that completely removing the scheduled config and creating a new one wold resolve the issue :(

""Cheers
G
 
The config seems fine now and the execution of a backup really shouldn't change it (and it's good that it doesn't).

There might be important information in the log of the bulk job. You should see it in the task viewer (and if not you can locate the log file with the commandscd /var/log/pve/tasks && ls -lt */*vzdump* I already mentioned).
 
The config seems fine now and the execution of a backup really shouldn't change it (and it's good that it doesn't).

There might be important information in the log of the bulk job. You should see it in the task viewer (and if not you can locate the log file with the commandscd /var/log/pve/tasks && ls -lt */*vzdump* I already mentioned).

@Fabian_E
is the attached what you are looking for?

""Cheers
G
 

Attachments

  • 20200320_vzdump_list.txt
    17.3 KB · Views: 4
Hi Team

any news on this one.

Backups are still being deleted when there should be 7.

seriously this issue shouldn't be this hard to diagnose :( :(

that was a sad face.

hoping we can find a solution for this issue soon.

""Cheers
G
 
@Fabian_E
is the attached what you are looking for?

""Cheers
G


These are the log files; you can use them directly as attachments. They are just normal text files, even though the name is a bit weird.
I'm interested in the contents of those files (maybe upload the two most recent ones).

It should start with
Code:
INFO: starting new backup job: vzdump ...
 
These are the log files; you can use them directly as attachments. They are just normal text files, even though the name is a bit weird.
I'm interested in the contents of those files (maybe upload the two most recent ones).

It should start with
Code:
INFO: starting new backup job: vzdump ...
@Fabian_E

this specific one is meant to haev 7 backups, it has Zero.

Code:
# cat 3/UPID:pve:0000728A:07DC241F:5E7737D3:vzdump:103:root@pam:
INFO: starting new backup job: vzdump 103 --mode snapshot --remove 0 --node pve --compress lzo --storage backup
INFO: Starting Backup of VM 103 (qemu)
INFO: Backup started at 2020-03-22 21:03:01
INFO: status = running
INFO: update VM 103: -lock backup
INFO: VM Name: "removed"
INFO: include disk 'virtio0' 'local-lvm:vm-103-disk-0' 60G
INFO: backup mode: snapshot
INFO: ionice priority: 7
INFO: snapshots found (not included into backup)
INFO: creating archive '/mnt/pve/backup/dump/vzdump-qemu-103-2020_03_22-21_03_00.vma.lzo'
INFO: issuing guest-agent 'fs-freeze' command
INFO: issuing guest-agent 'fs-thaw' command
INFO: started backup task '8805988e-84ca-493a-a6da-4689427d7a65'
INFO: status: 0% (171114496/64424509440), sparse 0% (137736192), duration 7, read/write 24/4 MB/s
INFO: status: 1% (646250496/64424509440), sparse 0% (179945472), duration 17, read/write 47/43 MB/s
INFO: status: 2% (1290272768/64424509440), sparse 1% (733122560), duration 20, read/write 214/30 MB/s
INFO: status: 3% (1946222592/64424509440), sparse 1% (895610880), duration 31, read/write 59/44 MB/s
INFO: status: 4% (2745761792/64424509440), sparse 2% (1289449472), duration 41, read/write 79/40 MB/s
INFO: status: 5% (3617390592/64424509440), sparse 3% (2135597056), duration 44, read/write 290/8 MB/s
INFO: status: 7% (4705091584/64424509440), sparse 5% (3223224320), duration 47, read/write 362/0 MB/s
INFO: status: 9% (5807931392/64424509440), sparse 6% (4324241408), duration 50, read/write 367/0 MB/s
INFO: status: 10% (6864830464/64424509440), sparse 8% (5365751808), duration 53, read/write 352/5 MB/s
INFO: status: 11% (7115702272/64424509440), sparse 8% (5371584512), duration 58, read/write 50/49 MB/s
INFO: status: 12% (7739080704/64424509440), sparse 8% (5544472576), duration 67, read/write 69/50 MB/s
INFO: status: 13% (8427077632/64424509440), sparse 8% (5564334080), duration 80, read/write 52/51 MB/s
INFO: status: 14% (9066250240/64424509440), sparse 8% (5586878464), duration 92, read/write 53/51 MB/s
INFO: status: 15% (9708044288/64424509440), sparse 8% (5657542656), duration 103, read/write 58/51 MB/s
INFO: status: 16% (10331422720/64424509440), sparse 8% (5743177728), duration 115, read/write 51/44 MB/s
INFO: status: 17% (10985209856/64424509440), sparse 9% (5826969600), duration 126, read/write 59/51 MB/s
INFO: status: 18% (11796545536/64424509440), sparse 9% (6293438464), duration 134, read/write 101/43 MB/s
INFO: status: 19% (12527206400/64424509440), sparse 10% (6925467648), duration 137, read/write 243/32 MB/s
INFO: status: 21% (13675659264/64424509440), sparse 12% (8073920512), duration 140, read/write 382/0 MB/s
INFO: status: 22% (14185725952/64424509440), sparse 13% (8482607104), duration 143, read/write 170/33 MB/s
INFO: status: 23% (15119089664/64424509440), sparse 14% (9250422784), duration 149, read/write 155/27 MB/s
INFO: status: 25% (16257318912/64424509440), sparse 16% (10364571648), duration 152, read/write 379/8 MB/s
INFO: status: 26% (17006133248/64424509440), sparse 17% (10981445632), duration 156, read/write 187/32 MB/s
INFO: status: 28% (18128961536/64424509440), sparse 18% (12104237056), duration 159, read/write 374/0 MB/s
INFO: status: 29% (18725928960/64424509440), sparse 19% (12577566720), duration 162, read/write 198/41 MB/s
INFO: status: 30% (19814744064/64424509440), sparse 21% (13666381824), duration 165, read/write 362/0 MB/s
INFO: status: 31% (20575354880/64424509440), sparse 22% (14380511232), duration 168, read/write 253/15 MB/s
INFO: status: 33% (21421948928/64424509440), sparse 23% (15157567488), duration 171, read/write 282/23 MB/s
INFO: status: 35% (22644523008/64424509440), sparse 25% (16369283072), duration 174, read/write 407/3 MB/s
INFO: status: 36% (23214358528/64424509440), sparse 26% (16839999488), duration 177, read/write 189/33 MB/s
INFO: status: 37% (24354488320/64424509440), sparse 27% (17980129280), duration 180, read/write 380/0 MB/s
INFO: status: 38% (25075580928/64424509440), sparse 28% (18626314240), duration 183, read/write 240/24 MB/s
INFO: status: 40% (26071072768/64424509440), sparse 30% (19621740544), duration 186, read/write 331/0 MB/s
INFO: status: 41% (27006795776/64424509440), sparse 31% (20522823680), duration 189, read/write 311/11 MB/s
INFO: status: 43% (28104720384/64424509440), sparse 33% (21597626368), duration 192, read/write 365/7 MB/s
INFO: status: 45% (29139206144/64424509440), sparse 35% (22612856832), duration 195, read/write 344/6 MB/s
INFO: status: 46% (30169956352/64424509440), sparse 36% (23608528896), duration 198, read/write 343/11 MB/s
INFO: status: 48% (31286820864/64424509440), sparse 38% (24706326528), duration 201, read/write 372/6 MB/s
INFO: status: 49% (31826116608/64424509440), sparse 38% (24977530880), duration 206, read/write 107/53 MB/s
INFO: status: 51% (33073987584/64424509440), sparse 40% (26225401856), duration 209, read/write 415/0 MB/s
INFO: status: 52% (34120531968/64424509440), sparse 42% (27251699712), duration 212, read/write 348/6 MB/s
INFO: status: 54% (35244277760/64424509440), sparse 44% (28375445504), duration 215, read/write 374/0 MB/s
INFO: status: 56% (36118331392/64424509440), sparse 45% (29202227200), duration 218, read/write 291/15 MB/s
INFO: status: 57% (37343264768/64424509440), sparse 47% (30427160576), duration 221, read/write 408/0 MB/s
INFO: status: 59% (38067306496/64424509440), sparse 48% (31075909632), duration 224, read/write 241/25 MB/s
INFO: status: 60% (39181811712/64424509440), sparse 49% (32190414848), duration 227, read/write 371/0 MB/s
INFO: status: 62% (40123432960/64424509440), sparse 51% (33111166976), duration 230, read/write 313/6 MB/s
INFO: status: 64% (41237020672/64424509440), sparse 53% (34224754688), duration 233, read/write 371/0 MB/s
INFO: status: 65% (42062905344/64424509440), sparse 54% (34991759360), duration 236, read/write 275/19 MB/s
INFO: status: 66% (42629201920/64424509440), sparse 55% (35469336576), duration 240, read/write 141/22 MB/s
INFO: status: 67% (43803475968/64424509440), sparse 56% (36643610624), duration 243, read/write 391/0 MB/s
INFO: status: 69% (44833701888/64424509440), sparse 58% (37659521024), duration 246, read/write 343/4 MB/s
INFO: status: 71% (46164738048/64424509440), sparse 60% (38990557184), duration 249, read/write 443/0 MB/s
INFO: status: 72% (46468366336/64424509440), sparse 60% (39124156416), duration 252, read/write 101/56 MB/s
INFO: status: 73% (47184412672/64424509440), sparse 61% (39750795264), duration 255, read/write 238/29 MB/s
INFO: status: 74% (48227090432/64424509440), sparse 63% (40793473024), duration 258, read/write 347/0 MB/s
INFO: status: 75% (48589373440/64424509440), sparse 63% (41012846592), duration 261, read/write 120/47 MB/s
INFO: status: 76% (49311580160/64424509440), sparse 64% (41675743232), duration 264, read/write 240/19 MB/s
INFO: status: 78% (50409308160/64424509440), sparse 66% (42773471232), duration 267, read/write 365/0 MB/s
INFO: status: 79% (51062374400/64424509440), sparse 67% (43214876672), duration 272, read/write 130/42 MB/s
INFO: status: 81% (52199751680/64424509440), sparse 68% (44352253952), duration 275, read/write 379/0 MB/s
INFO: status: 82% (52880146432/64424509440), sparse 69% (44894842880), duration 279, read/write 170/34 MB/s
INFO: status: 83% (53747253248/64424509440), sparse 70% (45704228864), duration 282, read/write 289/19 MB/s
INFO: status: 85% (54794059776/64424509440), sparse 72% (46748975104), duration 285, read/write 348/0 MB/s
INFO: status: 86% (55685677056/64424509440), sparse 73% (47513333760), duration 289, read/write 222/31 MB/s
INFO: status: 88% (56811585536/64424509440), sparse 75% (48639242240), duration 292, read/write 375/0 MB/s
INFO: status: 89% (57661128704/64424509440), sparse 76% (49362870272), duration 296, read/write 212/31 MB/s
INFO: status: 91% (58711474176/64424509440), sparse 78% (50413215744), duration 299, read/write 350/0 MB/s
INFO: status: 92% (59836727296/64424509440), sparse 79% (51529523200), duration 302, read/write 375/2 MB/s
INFO: status: 94% (60976070656/64424509440), sparse 81% (52668866560), duration 305, read/write 379/0 MB/s
INFO: status: 96% (62106238976/64424509440), sparse 83% (53799018496), duration 308, read/write 376/0 MB/s
INFO: status: 98% (63315509248/64424509440), sparse 85% (55008288768), duration 311, read/write 403/0 MB/s
INFO: status: 100% (64424509440/64424509440), sparse 87% (56117272576), duration 314, read/write 369/0 MB/s
INFO: transferred 64424 MB in 314 seconds (205 MB/s)
INFO: archive file size: 3.96GB
INFO: Finished Backup of VM 103 (00:05:27)
INFO: Backup finished at 2020-03-22 21:08:27
INFO: Backup job finished successfully
TASK OK
 
There does not seem to be removal triggered by that job. Could you run
Code:
grep "delete old" /var/log/pve/tasks/*/*vzdump*
This will show the logs where some removal was triggered.

If you do
Code:
ls -lt $(grep "delete old" /var/log/pve/tasks/*/*vzdump* -l)
you will see the logs that did remove older backups sorted by date. Please provide a recent one of those.

If there are no recent matches, then it's unlikely that vzdump is the one removing the backups. Is there any other service that could be the culprit?

You said that there is no backup for 103. Is there a backup immediately after doing a manual backup? When does it disappear?
 
There does not seem to be removal triggered by that job. Could you run
Code:
grep "delete old" /var/log/pve/tasks/*/*vzdump*

there is no delete old backup for 103.

this is strange as i manually create one and then its gone the next time the backup script runs.

Code:
s# ls -lt $(grep "delete old" /var/log/pve/tasks/*/*vzdump* -l)
-rw-r----- 1 www-data root 135511 Mar 23 05:22 /var/log/pve/tasks/E/UPID:pve:00003A19:0800F326:5E77960E:vzdump::root@pam:
-rw-r----- 1 www-data root 134452 Mar 22 05:25 /var/log/pve/tasks/E/UPID:pve:00005955:077D1CF8:5E76448E:vzdump::root@pam:
-rw-r----- 1 www-data root 140625 Mar 21 05:27 /var/log/pve/tasks/E/UPID:pve:0000679B:06F9471F:5E74F30E:vzdump::root@pam:
-rw-r----- 1 www-data root 143913 Mar 20 05:29 /var/log/pve/tasks/E/UPID:pve:00005DE7:06757101:5E73A18E:vzdump::root@pam:
-rw-r----- 1 www-data root    385 Mar 19 18:20 /var/log/pve/tasks/3/UPID:pve:0000593D:0641C9DF:5E731D53:vzdump:102:root@pam:
-rw-r----- 1 www-data root 162185 Mar 19 17:23 /var/log/pve/tasks/8/UPID:pve:000010F4:0632AF01:5E72F6A8:vzdump::root@pam:
-rw-r----- 1 www-data root 162237 Mar 19 05:31 /var/log/pve/tasks/E/UPID:pve:00005A5C:05F19B33:5E72500E:vzdump::root@pam:
-rw-r----- 1 www-data root    385 Mar 18 19:48 /var/log/pve/tasks/F/UPID:pve:00003886:05C600F9:5E71E06F:vzdump:116:root@pam:
-rw-r----- 1 www-data root 150085 Mar 18 05:24 /var/log/pve/tasks/E/UPID:pve:0000527C:056DC50B:5E70FE8E:vzdump::root@pam:
-rw-r----- 1 www-data root 148523 Mar 14 05:27 /var/log/pve/tasks/F/UPID:pve:000023B2:035E6D3A:5E6BB88F:vzdump::root@pam:
-rw-r----- 1 www-data root 141212 Mar  7 16:40 /var/log/pve/tasks/7/UPID:pve:000030D2:0002DEA9:5E632017:vzdump::root@pam:
-rw-r----- 1 www-data root 132961 Mar  7 05:08 /var/log/pve/tasks/E/UPID:pve:0000750E:2767F80B:5E627E0E:vzdump::root@pam:
-rw-r----- 1 www-data root 132128 Mar  6 05:08 /var/log/pve/tasks/E/UPID:pve:00002CEF:26E421BC:5E612C8E:vzdump::root@pam:
-rw-r----- 1 www-data root 134128 Mar  5 05:07 /var/log/pve/tasks/E/UPID:pve:000061E0:26604BEF:5E5FDB0E:vzdump::root@pam:
-rw-r----- 1 www-data root 134657 Mar  4 05:05 /var/log/pve/tasks/E/UPID:pve:00002930:25DC75FE:5E5E898E:vzdump::root@pam:
-rw-r----- 1 www-data root 133286 Mar  3 05:05 /var/log/pve/tasks/E/UPID:pve:0000717C:25589FD8:5E5D380E:vzdump::root@pam:
-rw-r----- 1 www-data root 131011 Mar  2 05:09 /var/log/pve/tasks/E/UPID:pve:0000384E:24D4C9DA:5E5BE68E:vzdump::root@pam:
-rw-r----- 1 www-data root 127683 Mar  2 00:09 /var/log/pve/tasks/F/UPID:pve:00000480:24B9C9BB:5E5BA16F:vzdump::root@pam:
-rw-r----- 1 www-data root  37166 Mar  1 04:38 /var/log/pve/tasks/6/UPID:pve:00000EEF:2453B326:5E5A9C16:vzdump::root@pam:
-rw-r----- 1 www-data root  95294 Feb 29 10:56 /var/log/pve/tasks/2/UPID:pve:0000346E:23EF731D:5E599B72:vzdump::root@pam:
-rw-r----- 1 www-data root  37679 Feb 29 04:38 /var/log/pve/tasks/6/UPID:pve:00006157:23CFDD08:5E594A96:vzdump::root@pam:
-rw-r----- 1 www-data root  38225 Feb 28 04:39 /var/log/pve/tasks/6/UPID:pve:000023EE:234C06D8:5E57F916:vzdump::root@pam:
-rw-r----- 1 www-data root  37998 Feb 27 04:37 /var/log/pve/tasks/6/UPID:pve:00006E51:22C830EE:5E56A796:vzdump::root@pam:
-rw-r----- 1 www-data root  38005 Feb 26 04:38 /var/log/pve/tasks/6/UPID:pve:00003ACF:22445B1A:5E555616:vzdump::root@pam:
-rw-r----- 1 www-data root  38854 Feb 25 04:37 /var/log/pve/tasks/6/UPID:pve:000007FB:21C08514:5E540496:vzdump::root@pam:
-rw-r----- 1 www-data root  38106 Feb 24 04:37 /var/log/pve/tasks/6/UPID:pve:00006993:213CAF2D:5E52B316:vzdump::root@pam:
-rw-r----- 1 www-data root  38739 Feb 23 04:39 /var/log/pve/tasks/6/UPID:pve:00003271:20B8D8DD:5E516196:vzdump::root@pam:
-rw-r----- 1 www-data root  95920 Feb 22 10:57 /var/log/pve/tasks/2/UPID:pve:00005011:20549903:5E5060F2:vzdump::root@pam:
-rw-r----- 1 www-data root  37253 Feb 22 04:37 /var/log/pve/tasks/6/UPID:pve:00006E50:203502CD:5E501016:vzdump::root@pam:
-rw-r----- 1 www-data root  38530 Feb 21 04:38 /var/log/pve/tasks/6/UPID:pve:00003A27:1FB12D14:5E4EBE96:vzdump::root@pam:
-rw-r----- 1 www-data root  38524 Feb 20 04:37 /var/log/pve/tasks/6/UPID:pve:000002E1:1F2D56FB:5E4D6D16:vzdump::root@pam:
-rw-r----- 1 www-data root  37782 Feb 19 04:36 /var/log/pve/tasks/6/UPID:pve:000048B8:1EA98105:5E4C1B96:vzdump::root@pam:
-rw-r----- 1 www-data root  37572 Feb 18 04:36 /var/log/pve/tasks/6/UPID:pve:00000CDC:1E25AB11:5E4ACA16:vzdump::root@pam:
-rw-r----- 1 www-data root  35963 Feb 17 04:36 /var/log/pve/tasks/6/UPID:pve:00003BB2:1DA1D4EA:5E497896:vzdump::root@pam:
-rw-r----- 1 www-data root  39174 Feb 16 04:38 /var/log/pve/tasks/6/UPID:pve:000078FD:1D1DFF02:5E482716:vzdump::root@pam:
-rw-r----- 1 www-data root  96693 Feb 15 10:58 /var/log/pve/tasks/2/UPID:pve:00001045:1CB9BEDD:5E472672:vzdump::root@pam:
-rw-r----- 1 www-data root  38102 Feb 15 04:37 /var/log/pve/tasks/6/UPID:pve:00002BAB:1C9A28F0:5E46D596:vzdump::root@pam:
-rw-r----- 1 www-data root  35962 Feb 14 04:36 /var/log/pve/tasks/6/UPID:pve:00005ACC:1C1652F7:5E458416:vzdump::root@pam:
-rw-r----- 1 www-data root  38523 Feb 13 04:37 /var/log/pve/tasks/6/UPID:pve:00000B21:1B927CDF:5E443296:vzdump::root@pam:
-rw-r----- 1 www-data root  36827 Feb 12 04:36 /var/log/pve/tasks/6/UPID:pve:0000385F:1B0EA6FE:5E42E116:vzdump::root@pam:
-rw-r----- 1 www-data root  38735 Feb 11 04:37 /var/log/pve/tasks/6/UPID:pve:00006FA9:1A8AD0E0:5E418F96:vzdump::root@pam:
-rw-r----- 1 www-data root  38734 Feb 10 04:37 /var/log/pve/tasks/6/UPID:pve:00002863:1A06FB0C:5E403E16:vzdump::root@pam:
-rw-r----- 1 www-data root  37357 Feb  9 04:37 /var/log/pve/tasks/6/UPID:pve:00005F72:1983250C:5E3EEC96:vzdump::root@pam:
-rw-r----- 1 www-data root  22879 Feb  8 16:45 /var/log/pve/tasks/A/UPID:pve:000014A1:1942CB72:5E3E47DA:vzdump::root@pam:
-rw-r----- 1 www-data root  96494 Feb  8 10:57 /var/log/pve/tasks/2/UPID:pve:0000065F:191EE516:5E3DEBF2:vzdump::root@pam:
-rw-r----- 1 www-data root  23072 Feb  8 04:29 /var/log/pve/tasks/6/UPID:pve:00003AD3:18FF4EE4:5E3D9B16:vzdump::root@pam:
-rw-r----- 1 www-data root  23401 Feb  7 04:28 /var/log/pve/tasks/6/UPID:pve:00002409:187B78D2:5E3C4996:vzdump::root@pam:
-rw-r----- 1 www-data root  23078 Feb  6 04:28 /var/log/pve/tasks/6/UPID:pve:00000548:17F7A2D6:5E3AF816:vzdump::root@pam:
-rw-r----- 1 www-data root  22860 Feb  5 04:28 /var/log/pve/tasks/6/UPID:pve:00000889:1773CD20:5E39A696:vzdump::root@pam:
-rw-r----- 1 www-data root  22649 Feb  4 04:28 /var/log/pve/tasks/6/UPID:pve:00004B3E:16EFF6DB:5E385516:vzdump::root@pam:
-rw-r----- 1 www-data root  23390 Feb  3 04:28 /var/log/pve/tasks/6/UPID:pve:00001559:166C2100:5E370396:vzdump::root@pam:
-rw-r----- 1 www-data root  23081 Feb  2 04:30 /var/log/pve/tasks/6/UPID:pve:00006556:15E84B18:5E35B216:vzdump::root@pam:
-rw-r----- 1 www-data root  85988 Feb  1 10:57 /var/log/pve/tasks/2/UPID:pve:0000188D:15840AE2:5E34B172:vzdump::root@pam:
-rw-r----- 1 www-data root  23193 Feb  1 04:30 /var/log/pve/tasks/6/UPID:pve:00003D2D:156474CE:5E346096:vzdump::root@pam:
-rw-r----- 1 www-data root  86641 Jan 25 10:53 /var/log/pve/tasks/2/UPID:pve:00002D3A:11E930FE:5E2B76F2:vzdump::root@pam:
-rw-r----- 1 www-data root  89933 Jan 18 10:55 /var/log/pve/tasks/2/UPID:pve:00003CB4:0E4E56FC:5E223C72:vzdump::root@pam:
-rw-r----- 1 www-data root  90369 Jan 11 10:55 /var/log/pve/tasks/2/UPID:pve:000017D6:0AB37D16:5E1901F2:vzdump::root@pam:
-rw-r----- 1 www-data root  89558 Jan  4 10:54 /var/log/pve/tasks/2/UPID:pve:00002CD9:0718A315:5E0FC772:vzdump::root@pam:
-rw-r----- 1 www-data root  89338 Dec 28 10:54 /var/log/pve/tasks/2/UPID:pve:00001F83:037DC8E1:5E068CF2:vzdump::root@pam:
-rw-r----- 1 www-data root   8162 Dec 28 04:21 /var/log/pve/tasks/6/UPID:pve:0000431A:035E330C:5E063C16:vzdump:122:root@pam:
-rw-r----- 1 www-data root   8260 Dec 27 04:19 /var/log/pve/tasks/6/UPID:pve:00002BCF:02DA5CF0:5E04EA96:vzdump:122:root@pam:
-rw-r----- 1 www-data root   7732 Dec 26 04:18 /var/log/pve/tasks/6/UPID:pve:00001332:0256871A:5E039916:vzdump:122:root@pam:
-rw-r----- 1 www-data root   7411 Dec 25 04:18 /var/log/pve/tasks/6/UPID:pve:000077B5:01D2B0F3:5E024796:vzdump:122:root@pam:
-rw-r----- 1 www-data root   7093 Dec 24 04:18 /var/log/pve/tasks/6/UPID:pve:000051CA:014EDAE5:5E00F616:vzdump:122:root@pam:
-rw-r----- 1 www-data root  88037 Dec 14 10:54 /var/log/pve/tasks/2/UPID:pve:00003361:11FC6F75:5DF417F2:vzdump::root@pam:

this is really strange behavour.

let me know what else is needed?

""Cheers
G
 
@Fabian_E

you can see from the attached screen shot that a backup of 103 was created yesterday 22nd March just under the blue highlighted row, and today its gone.

only the backup i created 10 min ago is now there.

no other backup services are running.

Ive currently disabled the backup schedule to see if the backup is deleted again.

also need to have at least 1 backup for this VM in case of Svr issues we at lease have 1 backup.

""Cheers
G

Screenshot from 2020-03-23 20-44-12.png
 
Code:
/var/log/pve/tasks/E/UPID:pve:00003A19:0800F326:5E77960E:vzdump::root@pam:

Could you provide the contents of this file? It's from the job from today that might've deleted the old backup of 103.
 
@Fabian_E

this is the excerpt for 103 below

Code:
INFO: Starting Backup of VM 103 (qemu)
INFO: Backup started at 2020-03-23 04:17:28
INFO: status = running
INFO: update VM 103: -lock backup
INFO: VM Name: name removed
INFO: include disk 'virtio0' 'local-lvm:vm-103-disk-0' 60G
INFO: backup mode: snapshot
INFO: ionice priority: 7
INFO: snapshots found (not included into backup)
INFO: creating archive '/mnt/pve/backup/dump/vzdump-qemu-103-2020_03_23-04_17_27.vma.lzo'
INFO: issuing guest-agent 'fs-freeze' command
INFO: issuing guest-agent 'fs-thaw' command
ERROR: got timeout
INFO: aborting backup job
ERROR: Backup of VM 103 failed - got timeout
INFO: Failed at 2020-03-23 04:17:52

do you need the entire output for the jobs?

the interesting thing is that taking a manual backup works just fine and completes.

""Cheers
G
 
What's the output of
Code:
grep "delete old" /var/log/pve/tasks/E/UPID:pve:00003A19:0800F326:5E77960E:vzdump::root@pam:

Is there a line indicating the removal of the backup for 103?
 
What's the output of
Code:
grep "delete old" /var/log/pve/tasks/E/UPID:pve:00003A19:0800F326:5E77960E:vzdump::root@pam:

Is there a line indicating the removal of the backup for 103?

haha nothing as mentioned earlier.

so the question is why is it deleting?

Code:
~# grep "delete old" /var/log/pve/tasks/E/UPID:pve:00003A19:0800F326:5E77960E:vzdump::root@pam:
INFO: delete old backup '/mnt/pve/backup/dump/vzdump-qemu-100-2020_03_14-03_45_03.vma.lzo'
INFO: delete old backup '/mnt/pve/backup/dump/vzdump-qemu-101-2020_03_14-03_50_42.vma.lzo'
INFO: delete old backup '/mnt/pve/backup/dump/vzdump-qemu-110-2020_03_14-04_38_14.vma.lzo'
INFO: delete old backup '/mnt/pve/backup/dump/vzdump-qemu-112-2020_03_14-04_45_30.vma.lzo'
INFO: delete old backup '/mnt/pve/backup/dump/vzdump-qemu-113-2020_03_14-04_59_18.vma.lzo'
INFO: delete old backup '/mnt/pve/backup/dump/vzdump-qemu-114-2020_03_14-05_02_16.vma.lzo'
INFO: delete old backup '/mnt/pve/backup/dump/vzdump-qemu-115-2020_03_07-16_22_06.vma.lzo'
INFO: delete old backup '/mnt/pve/backup/dump/vzdump-qemu-117-2020_03_14-05_08_13.vma.lzo'
INFO: delete old backup '/mnt/pve/backup/dump/vzdump-qemu-118-2020_03_14-05_09_25.vma.lzo'
INFO: delete old backup '/mnt/pve/backup/dump/vzdump-qemu-120-2020_03_14-05_18_10.vma.lzo'

the backup i took manually last night is still there this morning, so to me this looks like the vzdump script is deleting the backups for 103 even though its not listed in the output.

but this doesn't explain why there are other VM's that only have 1 backup in the chain and others have 7.

they should all have 7 backups in the chain.

Theory.

  • As a VM is deleted its ID is freed up.
  • If a new VM is created it takes the next closes ID in line thats available.
  • With this a VM that may have not have had a schedule or had a different schedule prior to updating the current backup schedule task could be deleted every time a new scheduled backup is taken, why this behavour is happening i have no idea.

thoughts?

""Cheers
G
 
haha nothing as mentioned earlier.

so the question is why is it deleting?

Code:
~# grep "delete old" /var/log/pve/tasks/E/UPID:pve:00003A19:0800F326:5E77960E:vzdump::root@pam:
INFO: delete old backup '/mnt/pve/backup/dump/vzdump-qemu-100-2020_03_14-03_45_03.vma.lzo'
INFO: delete old backup '/mnt/pve/backup/dump/vzdump-qemu-101-2020_03_14-03_50_42.vma.lzo'
INFO: delete old backup '/mnt/pve/backup/dump/vzdump-qemu-110-2020_03_14-04_38_14.vma.lzo'
INFO: delete old backup '/mnt/pve/backup/dump/vzdump-qemu-112-2020_03_14-04_45_30.vma.lzo'
INFO: delete old backup '/mnt/pve/backup/dump/vzdump-qemu-113-2020_03_14-04_59_18.vma.lzo'
INFO: delete old backup '/mnt/pve/backup/dump/vzdump-qemu-114-2020_03_14-05_02_16.vma.lzo'
INFO: delete old backup '/mnt/pve/backup/dump/vzdump-qemu-115-2020_03_07-16_22_06.vma.lzo'
INFO: delete old backup '/mnt/pve/backup/dump/vzdump-qemu-117-2020_03_14-05_08_13.vma.lzo'
INFO: delete old backup '/mnt/pve/backup/dump/vzdump-qemu-118-2020_03_14-05_09_25.vma.lzo'
INFO: delete old backup '/mnt/pve/backup/dump/vzdump-qemu-120-2020_03_14-05_18_10.vma.lzo'
Do the VMs appearing in the list all have 7 backups?


the backup i took manually last night is still there this morning, so to me this looks like the vzdump script is deleting the backups for 103 even though its not listed in the output.
Was it still there after the daily vzdump job completed?


but this doesn't explain why there are other VM's that only have 1 backup in the chain and others have 7.

they should all have 7 backups in the chain.
Is there a VM that had two successful backup jobs, but only one backup on the storage? Does the more recent log contain a line indicating the removal of the older backup?


Theory.

  • As a VM is deleted its ID is freed up.
  • If a new VM is created it takes the next closes ID in line thats available.
  • With this a VM that may have not have had a schedule or had a different schedule prior to updating the current backup schedule task could be deleted every time a new scheduled backup is taken, why this behavour is happening i have no idea.

thoughts?

""Cheers
G
The vzdump script has no way to know that it's dealing with a new VM. If there still are backups of an older VM with the same ID, it should act as if they came from the same VM, i.e. remove the old ones once there's more than maxfiles many.
 
There's also some Erase data entries in your task log. Could you share those logs (if you double click you on the entry it'll show you the log)?
 
Hi @Fabian_E

Do the VMs appearing in the list all have 7 backups?

VM's 113, 117, 120 have all been deleted, yes all the remaining VM's have 7 backups.

Was it still there after the daily vzdump job completed?

I stopped the daily VZDump yesterday but prior top this after teh daily VZDump the manual backups was removed and no backups existed not even the once created by the daily VZDump script.

Is there a VM that had two successful backup jobs, but only one backup on the storage? Does the more recent log contain a line indicating the removal of the older backup?

Not sure what you mean by this question can you provide more context or reword the question?

The only way i can answer this question with my understanding is that for example VM 103 when there was a manual backup taken it existed until the VZDump backup schedule ran at 3:45am, the schedule would create another automated backup and when completed would remove both the amnual backup and the automated one.

There's also some Erase data entries in your task log. Could you share those logs (if you double click you on the entry it'll show you the log)?

Code:
# cat /var/log/pve/tasks/2/UPID:pve:00006FAF:080F23CD:5E77BA62:imgdel:103@backup:root@pam:
Removed volume 'backup:backup/vzdump-qemu-103-2020_03_22-21_03_00.vma.lzo'
TASK OK

hope that helps.

""Cheers
G
 
Hi @Fabian_E



VM's 113, 117, 120 have all been deleted, yes all the remaining VM's have 7 backups.
Are there any backups for 113, 117, 120? Are there any Erase data tasks that removed those?

I stopped the daily VZDump yesterday but prior top this after teh daily VZDump the manual backups was removed and no backups existed not even the once created by the daily VZDump script.



Not sure what you mean by this question can you provide more context or reword the question?

The only way i can answer this question with my understanding is that for example VM 103 when there was a manual backup taken it existed until the VZDump backup schedule ran at 3:45am, the schedule would create another automated backup and when completed would remove both the amnual backup and the automated one.



Code:
# cat /var/log/pve/tasks/2/UPID:pve:00006FAF:080F23CD:5E77BA62:imgdel:103@backup:root@pam:
Removed volume 'backup:backup/vzdump-qemu-103-2020_03_22-21_03_00.vma.lzo'
TASK OK

hope that helps.

""Cheers
G

This erase job deleted the manual backup for 103. This is not part of the vzdump job as you can also see from the time it was executed. Please try finding out why those Erase data tasks are there. Is there any automated script issuing pvesm free commands? Anything else you can think of from around the time those tasks happen?
 
Are there any backups for 113, 117, 120? Are there any Erase data tasks that removed those?

yes 7 backups are there.

This erase job deleted the manual backup for 103. This is not part of the vzdump job as you can also see from the time it was executed. Please try finding out why those Erase data tasks are there. Is there any automated script issuing pvesm free commands? Anything else you can think of from around the time those tasks happen?

we have nothing added to default PVE setup so no there shouldn't be anything.

the only thing that has changed is that we had an original schedule which provided a backup 1 per week for a number of VM's.

A new schedule was created for a few different VM's to be daily.

Eventually the weekly schedule was scraped and we moved to a full 7 day nightly backup schedule with the original PVA backup schedule.

noticed the issue and created a post on the forum

after that as a troubleshooting step we deleted the original schedule and created a new one from scratch for all the VM's to be backed up nightly.

thats it.

""Cheers
G
 
Not sure what you mean by this question can you provide more context or reword the question?

I mean backup jobs can fail, as happened for the one for 103 (the excerpt you posted). In the logs, for each VM you'll either see
Code:
Finished Backup of VM <ID>
or
Code:
ERROR: Backup of VM <ID> failed - <ERROR>
If you'd like help with finding the root cause of why it failed, feel free to open up another thread.


What I'm asking is, if there are successful jobs, where the backup isn't there anymore. If there is, please check if there is an Erase data task that removed it. AFAICT in case of backups those tasks can only happen if somebody/something calls pvesm free or if the backup file is deleted manually via the GUI.

The only way i can answer this question with my understanding is that for example VM 103 when there was a manual backup taken it existed until the VZDump backup schedule ran at 3:45am, the schedule would create another automated backup and when completed would remove both the amnual backup and the automated one.

The automated one wasn't successful, that's why it's not there. And the manual one was deleted by an Erase data task.

Could you
Code:
ls -l /var/log/pve/tasks/2/UPID:pve:00006FAF:080F23CD:5E77BA62:imgdel:103@backup:root@pam:
so I can see when that task was executed? Are there often Erase data tasks from around the time the vzdump job runs?
 
Hi @Fabian_E

seeing some strange behavior at the moment.

we have no automated backup tasks running so backups shouldnt be deleted, but every day im seeing the erase task execute and remove backups.

The only vzdump jobs running atm are the ones im manually executing daily per VM.

any ideas ?

see additional notes after the screen shots below:



Screenshot from 2020-03-28 12-34-45.png

Screenshot from 2020-03-28 12-34-39.png

Screenshot from 2020-03-28 12-34-06.png

Screenshot from 2020-03-28 12-33-30.png

As an additional note that i just witnessed happening:

  • Launched a manual backup
  • Tack starter
  • Task ended
  • Then an Erase task started soon after.
  • The erase task deleted the oldest backup.

this shouldn't be happening.

Where is this task/ script created and generated is the question?

""Cheers
G
 
Last edited:

About

The Proxmox community has been around for many years and offers help and support for Proxmox VE, Proxmox Backup Server, and Proxmox Mail Gateway.
We think our community is one of the best thanks to people like you!

Get your subscription!

The Proxmox team works very hard to make sure you are running the best software and getting stable updates and security enhancements, as well as quick enterprise support. Tens of thousands of happy customers have a Proxmox subscription. Get yours easily in our online shop.

Buy now!